MODIFIED ZINC FINGER BINDING PROTEINS

CROSS REFERENCE TO RELATED APPLICATIONS 
This application claims the benefit of U.S. provisional patent application Serial No. 
60/263,445 filed January 22, 2001 and also claims the benefit of U.S. provisional patent 
application Serial No. 60/290,716 filed May 11, 2001; both of which disclosures are hereby
incorporated by reference in their entireties. 

TECHNICAL FIELD 
The methods and compositions disclosed herein relate generally to the field of 
regulation of gene expression and specifically to methods of modulating gene expression 
by utilizing polypeptides derived from zinc finger-nucleotide binding proteins. 

BACKGROUND 

Sequence-specific binding of proteins to DNA, RNA, protein and other molecules 
is involved in a number of cellular processes such as, for example, transcription, 
replication, chromatin structure, recombination, DNA repair, RNA processing and 
translation. The binding specificity of cellular binding proteins that participate in 
protein-DNA, protein-RNA and protein-protein interactions contributes to development, 
differentiation and homeostasis. Alterations in specific protein interactions can be 
involved in various types of pathologies such as, for example, cancer, cardiovascular 
disease and infection. 

Zinc finger proteins (ZFPs) are proteins that can bind to DNA in a sequence- 
specific manner. Zinc fingers were first identified in the transcription factor TFIIIA from 
the oocytes of the African clawed toad, Xenopus laevis. A single zinc finger domain of 
this class of ZFPs is about 30 amino acids in length, and several structural studies have
demonstrated that it contains a beta turn (containing the two invariant cysteine residues) 
and an alpha helix (containing the two invariant histidine residues), which are held in a 
particular conformation through coordination of a zinc atom by the two cysteines and the 
two histidines. This class of ZFPs is also known as C2H2 ZFPs. Additional classes of 
ZFPs have also been suggested. (See, e.g., Jiang et al. (1996) J. Biol. Chem. 271:10723-10730
for a discussion of Cys-Cys-His-Cys (C3H) ZFPs.) To date, over 10,000 zinc
finger sequences have been identified in several thousand known or putative transcription
factors. Zinc finger domains are involved not only in DNA recognition, but also in RNA
binding and in protein-protein binding. Current estimates are that this class of molecules
will constitute about 2% of all human genes.

Most zinc finger proteins have conserved cysteine and histidine residues that
tetrahedrally coordinate the single zinc atom in each finger domain. In particular, most
ZFPs are characterized by finger components of the general sequence:
-Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (SEQ ID NO: 1), where X is any amino acid (the C2H2
ZFPs). The zinc-coordinating sequences of this most widely represented class contain two
cysteines and two histidines with particular spacings. For example, the structures of zinc
fingers found in the yeast protein ADR1, the human male associated protein ZFY, the HIV
enhancer protein and the Xenopus protein Xfin have been solved by high resolution NMR
methods (Kochoyan, et al., Biochemistry, 30:3371-3386, 1991; Omichinski, et al.,
Biochemistry, 29:9324-9334, 1990; Lee, et al., Science, 245:635-637, 1989). Based on x-ray
crystallography, the three-dimensional structure of a three finger polypeptide-DNA complex
derived from the mouse immediate early protein zif268 (also known as Krox-24) has been
solved (Pavletich and Pabo, Science, 252:809-817, 1991). The folded structure of each finger
contains an antiparallel β-turn, a finger tip region and a short amphipathic α-helix. The
metal coordinating ligands bind to the Zn ion and, in the case of zif268 zinc fingers, the
short amphipathic α-helix binds in the major groove of DNA. In addition, the conserved
hydrophobic amino acids and zinc coordination by the cysteine and histidine residues
stabilize the structure of the individual finger domain.
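
As an illustration of the spacing constraints in the general C2H2 sequence given above, the following minimal Python sketch scans a one-letter amino acid string for segments matching Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His. The function name and the test sequence are hypothetical and are not taken from any protein named herein. A match indicates only that the ligand spacing is satisfied; it does not establish that the segment folds or coordinates zinc.

```python
import re

# Ligand spacing of the general C2H2 sequence:
#   Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His
# "[A-Z]" stands in for X (any amino acid, one-letter code).
C2H2_PATTERN = re.compile(r"C[A-Z]{2,4}C[A-Z]{12}H[A-Z]{3,5}H")


def find_c2h2_segments(protein_sequence):
    """Return (start, end, segment) for each non-overlapping pattern match."""
    return [
        (m.start(), m.end(), m.group(0))
        for m in C2H2_PATTERN.finditer(protein_sequence.upper())
    ]


# Hypothetical sequence built only to exercise the pattern (not a real protein).
example = "GG" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H" + "GG"
print(find_c2h2_segments(example))
```

Because the spacers are of variable length, a real protein sequence may contain overlapping candidate segments; this sketch reports only the non-overlapping matches found in a left-to-right scan.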

The folding of a C2H2 ZFP into the proper finger structure can be entirely
disrupted by exchange of the C2H2 ligand amino acids. Miura et al. (1998) Biochim.
Biophys. Acta 1384:171-179. Furthermore, metal binding specificity of peptides based
on the C2H2 consensus sequence can be altered. Krizek et al. (1993) Inorg. Chem.
32:937-940; Merkle et al. (1991) J. Am. Chem. Soc. 113:5450-5451. Although detailed
models for the interaction of zinc fingers and DNA have also been proposed (Berg, 1988;
Berg, 1990; Churchill, et al., 1990), mutations in finger 2 of the three-fingered C2H2 ZFP
zif268 have been shown to entirely abolish DNA binding (Green et al. (1998) Biochem. J.
333:85-90).

Nonetheless, increased understanding of the nature and mechanism of protein 
binding specificity has encouraged the hope that specificity of a binding protein could be 
altered in a predictable fashion, or that a binding protein of predetermined specificity 
could be constructed de novo. See, for example, Blackburn (2000) Curr. Opin. Struct. 
Biol. 10:399-400; Segal et al. (2000) Curr. Opin. Chem. Biol. 4:34-39. To this end, 
attempts have been made to modify C2H2 zinc finger proteins. See, e.g., U.S. Patent 
Nos. 6,007,988; 6,013,453; 6,140,081; PCT WO98/53057; PCT WO98/53058; 
PCT WO98/53059; PCT WO98/53060; PCT WO00/23464; PCT WO 00/42219; Choo 
et al. (2000) Curr. Opin. Struct. Biol. 10:411-416; Segal et al. (2000) Curr. Opin. Chem.
Biol. 4:34-39; and references cited in these publications. 

To date, however, cellular studies using designed C2H2 ZFPs have utilized 
relatively few positions in the zinc finger as adjustable parameters to obtain optimal 
activity. In particular, studies to date have modified only those residues at the
finger-DNA interface. These have included positions known to make direct base contacts,
'supporting' or 'buttressing' residues immediately adjacent to the base-contacting 
positions, and positions capable of contacting the phosphate backbone of the DNA. 
Furthermore, many observed effects have been quite modest, and the possibility that 
improved ZFP activities might be achieved via substitution of residues at other positions 
in the finger or using non-C2H2 polypeptides has remained completely uninvestigated. 

Thus, there exists a need for additional designed or selected zinc finger binding 
proteins. 

SUMMARY 

Disclosed herein are binding proteins, particularly zinc finger binding proteins, with
modified metal coordination sites. Methods of making and using these proteins are also
provided. In preferred embodiments, the binding proteins contain three zinc coordinating
fingers and one or more of these fingers are modified, non-canonical (e.g., non-C2H2)
finger components. Preferably, the third finger of a three-finger ZFP is modified and
non-canonical.

In one aspect, an isolated, non-canonical zinc finger binding protein comprising one or
more non-canonical zinc finger components that bind to a target sequence is provided. The
isolated zinc finger binding protein can be provided as a nucleic acid molecule or as a
polypeptide. Furthermore, the target sequence can be an amino acid sequence, DNA (e.g.,
promoter sequence) or RNA and, additionally, may be in a prokaryotic (e.g., bacteria) or
eukaryotic cell (e.g., plant cell, yeast cell, fungal cell, animal such as human). In certain
embodiments, the amino acid sequence of one or more of the zinc finger components is
X3-B-X2-4-Cys-X12-His-X1-7-His-X4; X3-Cys-X2-4-B-X12-His-X1-7-His-X4;
X3-Cys-X2-4-Cys-X12-Z-X1-7-His-X4; X3-Cys-X2-4-Cys-X12-His-X1-7-Z-X4;
X3-B-X2-4-B-X12-His-X1-7-His-X4; X3-B-X2-4-Cys-X12-Z-X1-7-His-X4;
X3-B-X2-4-Cys-X12-His-X1-7-Z-X4; X3-Cys-X2-4-B-X12-Z-X1-7-His-X4;
X3-Cys-X2-4-B-X12-His-X1-7-Z-X4; X3-Cys-X2-4-Cys-X12-Z-X1-7-Z-X4;
X3-Cys-X2-4-B-X12-Z-X1-7-Z-X4; X3-B-X2-4-Cys-X12-Z-X1-7-Z-X4;
X3-B-X2-4-B-X12-His-X1-7-Z-X4; X3-B-X2-4-B-X12-Z-X1-7-His-X4; and
X3-B-X2-4-B-X12-Z-X1-7-Z-X4, wherein X is any amino acid, B is any amino acid except
cysteine and Z is any amino acid except histidine.

The modified zinc finger proteins described herein can include any number of zinc
coordinating finger components in which one or more of the zinc finger components are
non-canonical. In preferred embodiments, the ZFP comprises three fingers, wherein one or more
of the finger components is non-canonical. In certain embodiments, the third zinc finger
component is non-canonical. In other embodiments, any of the ZFPs described herein
comprise a modified plant ZFP backbone.

In other aspects, fusion polypeptides comprising (a) any of the zinc finger binding
proteins described herein and (b) at least one functional domain are provided. The functional
domain may be, for example, a repressive domain such as KRAB, MBD-2B, v-ErbA, MBD3,
TR, and members of the DNMT family; an activation domain such as VP16, p65 subunit of
NF-kappa B, and VP64; an insulator domain; a chromatin remodeling protein; and/or a methyl
binding domain.

In other aspects, polynucleotides encoding any of the zinc finger proteins (or fusion
molecules) described herein are provided. Expression vectors and host cells comprising these
polynucleotides are also provided.

In yet other aspects, a method of modulating expression of a gene is provided. The
method comprises the step of contacting a region of DNA with any of the zinc finger-
containing fusion molecules described herein. In certain embodiments, the zinc finger
binding protein of the fusion molecule binds to a target site in a gene encoding a product
selected from the group consisting of vascular endothelial growth factor, erythropoietin,
androgen receptor, PPAR-γ2, p16, p53, pRb, dystrophin, E-cadherin, delta-9 desaturase,
delta-12 desaturases from other plants, delta-15 desaturase, acetyl-CoA carboxylase, acyl-
ACP-thioesterase, ADP-glucose pyrophosphorylase, starch synthase, cellulose synthase,
sucrose synthase, senescence-associated genes, heavy metal chelators, fatty acid
hydroperoxide lyase, polygalacturonase, EPSP synthase, plant viral genes, plant fungal
pathogen genes, and plant bacterial pathogen genes. (See also WO 00/41566.) The gene may
be in any cell, for example a plant cell or animal (e.g., human) cell.

In still further aspects, compositions comprising any of the zinc finger proteins (or
fusion molecules) described herein and a pharmaceutically acceptable excipient are provided.

These and other embodiments will readily occur to those of skill in the art in light 
of the disclosure herein. 

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 is a graph depicting levels of LCK gene mRNA (normalized to 18S
rRNA levels) in cells transfected with constructs encoding fusions of the VP16 activation
domain with a canonical ZFP (PTP2), a modified ZFP (PTP2(H→C)), and a control
construct (NVF).

Figure 2 shows VEGF-A levels in the culture medium of cells that had been
transfected with plasmids encoding non-canonical ZFP fusion proteins comprising a
VP16 activation domain, that were targeted to the VEGF gene. Mock indicates
untransfected cells; empty vector indicates transfection with a DNA construct lacking
sequences encoding a fusion protein; and C2H2 indicates cells transfected with plasmids
encoding the canonical C2H2 VOP30A and VOP32B ZFP-VP16 fusion proteins. S, E,
K, CT, C, GC and GGC indicate non-canonical derivatives of VOP30A and VOP32B
containing a C2HC zinc finger, as described in Table 1. The left-hand bar of each pair
shows results for VOP30A and its non-canonical derivatives; the right-hand bar of each
pair shows results for VOP32B and its non-canonical derivatives. The C derivative of
VOP32B and the GC derivative of VOP30A were not tested. Results are the average of
two determinations.

Figure 3, panels A and B, are schematics depicting construction of the YCF3 
expression vector useful in expressing modified ZFPs. 

Figure 4 shows the results of analysis of GMT mRNA in RNA isolated from
Arabidopsis thaliana protoplasts that had been transfected with constructs encoding
fusions of a transcriptional activation domain with various modified plant ZFPs. Results
are expressed as GMT mRNA normalized to 18S rRNA. AGMT numbers on the
abscissa refer to the modified plant ZFP binding domains shown in Table 2. Duplicate
TaqMan® analyses are shown for each RNA sample.

DETAILED DESCRIPTION 

General 

The present disclosure provides isolated, non-canonical zinc finger binding
polypeptides (ZFPs), wherein one or more of the zinc finger components differs from the
canonical consensus sequence of Cys-Cys-His-His (e.g., Cys2-His2). The polypeptide
can be a fusion polypeptide and, either by itself or as part of such a fusion, can enhance
or suppress transcription of a gene, and may bind to DNA, RNA and/or protein.
Polynucleotides encoding non-canonical ZFPs and fusion proteins comprising one or
more non-canonical ZFPs are also provided. Additionally provided are pharmaceutical
compositions comprising a therapeutically effective amount of any of the modified zinc
finger-nucleotide binding polypeptides described herein or functional fragments thereof;
or a therapeutically effective amount of a nucleotide sequence that encodes any of the
modified zinc finger-nucleotide binding polypeptides or functional fragments thereof,
wherein the zinc finger polypeptide or functional fragment thereof binds to a cellular
nucleotide sequence to modulate the function of the cellular nucleotide sequence, in
combination with a pharmaceutically acceptable carrier. Also provided are screening
methods for obtaining a modified zinc finger-nucleotide binding polypeptide which binds
to a cellular or viral nucleotide sequence.

Currently, designed and/or selected ZFPs utilize relatively few positions in the
zinc finger as adjustable parameters to obtain optimal activity. In particular, studies to
date have altered only those residues at the finger-DNA interface. See, e.g., U.S. Patent
Nos. 6,007,988; 6,013,453; 6,140,081 and 6,140,466, as well as PCT WO 00/42219. As
noted above, the observed effects have been quite modest, and the possibility that
improved ZFP activities might be accessible via substitution of residues at other positions
in the finger has not been investigated.

Accordingly, in one embodiment, modified (e.g., non-canonical) zinc finger 
proteins are described in which the sequence of one or more zinc fingers of the ZFP 
differs from the canonical consensus sequence containing two cysteine (Cys) residues 
and two histidine (His) residues: 

X3-Cys-X2-4-Cys-X12-His-X1-7-His-X4 (SEQ ID NO: 2)
(also known as the "Cys2-His2" or "C2H2" consensus sequence). As zinc coordination
provides the principal folding energy for zinc fingers, adjustment of zinc coordinating
residues would appear to provide a ready means for modifying finger stability and
structure, which could impact on a variety of important functional features of zinc finger
protein transcription factors. In particular, features such as cellular half-life,
interactions with other cellular factors, DNA binding specificity and affinity, and relative 
orientation of functional domains would all be expected to be influenced by residue 
choice at the zinc-coordinating positions. 

Thus, in preferred embodiments, one or more zinc coordinating fingers making up 
the zinc finger protein has any of the following sequences: 

X3-B-X2-4-Cys-X12-His-X1-7-His-X4
X3-Cys-X2-4-B-X12-His-X1-7-His-X4
X3-Cys-X2-4-Cys-X12-Z-X1-7-His-X4
X3-Cys-X2-4-Cys-X12-His-X1-7-Z-X4
X3-B-X2-4-B-X12-His-X1-7-His-X4
X3-B-X2-4-Cys-X12-Z-X1-7-His-X4
X3-B-X2-4-Cys-X12-His-X1-7-Z-X4
X3-Cys-X2-4-B-X12-Z-X1-7-His-X4
X3-Cys-X2-4-B-X12-His-X1-7-Z-X4
X3-Cys-X2-4-Cys-X12-Z-X1-7-Z-X4
X3-Cys-X2-4-B-X12-Z-X1-7-Z-X4
X3-B-X2-4-Cys-X12-Z-X1-7-Z-X4
X3-B-X2-4-B-X12-His-X1-7-Z-X4
X3-B-X2-4-B-X12-Z-X1-7-His-X4
X3-B-X2-4-B-X12-Z-X1-7-Z-X4

where X = any amino acid
B = any amino acid except cysteine
Z = any amino acid except histidine
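
The fifteen sequences listed above correspond to every way of replacing at least one of the four zinc-coordinating positions of the canonical sequence (2^4 - 1 = 15). The following Python sketch, offered only as an illustration, enumerates them and builds a regular expression for checking a single finger written in one-letter code. All function and variable names are hypothetical, as is the test sequence.

```python
from itertools import product
import re

CANONICAL = ("Cys", "Cys", "His", "His")
SUBSTITUTE = ("B", "B", "Z", "Z")  # B: any residue except Cys; Z: any except His
ONE_LETTER = {"Cys": "C", "His": "H", "B": "[^C]", "Z": "[^H]"}


def motif(ligands):
    """Write four coordinating positions in the X3-...-X4 notation used above."""
    c1, c2, h1, h2 = ligands
    return f"X3-{c1}-X2-4-{c2}-X12-{h1}-X1-7-{h2}-X4"


def non_canonical_ligand_sets():
    """All ligand combinations with at least one coordinating position replaced."""
    sets = []
    for mask in product((False, True), repeat=4):
        if any(mask):  # skip the all-canonical (C2H2) case
            sets.append(tuple(
                SUBSTITUTE[i] if replaced else CANONICAL[i]
                for i, replaced in enumerate(mask)
            ))
    return sets


def finger_regex(ligands):
    """Regular expression for one complete finger with the given ligands."""
    c1, c2, h1, h2 = (ONE_LETTER[name] for name in ligands)
    return re.compile(f"^.{{3}}{c1}.{{2,4}}{c2}.{{12}}{h1}.{{1,7}}{h2}.{{4}}$")


ligand_sets = non_canonical_ligand_sets()
print(len(ligand_sets))
for ligands in ligand_sets:
    print(motif(ligands))

# Hypothetical finger in which the second histidine is replaced by cysteine.
c2hc_finger = "AAA" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "C" + "AAAA"
print(bool(finger_regex(("Cys", "Cys", "His", "Z")).match(c2hc_finger)))
```

Because B and Z are defined by exclusion and two of the spacers vary in length, a given sequence can satisfy more than one of the fifteen patterns; the sketch tests patterns one at a time and does not choose among them.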

Additionally, it is preferred that a zinc finger protein comprises at least three zinc 
coordinating fingers and that at least one of these fingers is non-canonical. In the standard
nomenclature for ZFPs, the "first" finger is the N-terminal-most finger of the protein (with
respect to the other fingers) and binds to the 3'-most triplet (or quadruplet) subsite in the
target site. Additional fingers, moving towards the C-terminus of the protein, are numbered 
sequentially. For example, in certain embodiments, a three-finger zinc finger protein is 
provided wherein the first two fingers are of the C2-H2 class but the first or second histidine 
residue in the third finger (and optionally adjacent amino acid residues) is substituted with 
Cys or with Cys and additional amino acids, such as glycine. In other embodiments, a three- 
finger zinc finger protein is provided wherein the first or second cysteine residue in the first 
finger is substituted with histidine or with histidine and additional amino acids such as 
glycine. Furthermore, in certain embodiments, a finger of a zinc finger protein is modified 
such that, in one or more of the fingers, one or more cysteine or histidine residues is replaced 
with a different amino acid such as, for example, serine. In one embodiment, the second 
finger of a three-finger zinc finger protein is modified such that one or both of the cysteine 
residues are replaced with serine (and/or additional amino acids). Additionally, carboxyl- 
containing amino acids, such as, for example, aspartic acid and glutamic acid are substituted 
for cysteine and/or histidine in a zinc finger. Furthermore, ZFPs comprising two or more 
fingers in which more than one finger is modified are also provided. 
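
The finger numbering convention described above (finger 1 is N-terminal-most and binds the 3'-most subsite) can be sketched as follows. This is a simplified illustration that assumes non-overlapping subsites of equal length; the function name and the nine-base target site are hypothetical.

```python
def subsites_by_finger(target_site, subsite_length=3):
    """Map finger number (1 = N-terminal-most finger) to its DNA subsite.

    target_site is written 5'->3'. Assumes non-overlapping subsites of
    equal length, which is a simplification made for this sketch.
    """
    if len(target_site) % subsite_length != 0:
        raise ValueError("target site length must be a multiple of subsite_length")
    subsites = [
        target_site[i:i + subsite_length]
        for i in range(0, len(target_site), subsite_length)
    ]
    # Finger 1 binds the 3'-most subsite, so number subsites from the 3' end.
    return dict(enumerate(reversed(subsites), start=1))


# Hypothetical 9-base target site, written 5'->3'.
print(subsites_by_finger("AAACCCGGG"))
```

For a three-finger protein, the third finger therefore corresponds to the 5'-most subsite of the target site as written.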

Therefore, the ZFPs disclosed herein differ from previously described designed zinc 
finger protein transcription factors in that they comprise at least one zinc-coordinating finger 
that differs from the canonical consensus sequence (Cys-Cys-His-His). It will be readily 
apparent that various combinations of modified zinc fingers can be used in a single protein; 
for example, all of the finger components may be modified using the same or different 
modified zinc fingers. Alternatively, fewer than all of the fingers can be modified using the
same or different modified fingers. Furthermore, the non-canonical modified finger 
components described herein can also be used in combination with previously described 
C2H2 ZFP finger components. 

In additional embodiments, the isolated non-canonical zinc fingers described herein
are used in fusion proteins, for example fusions of a ZFP DNA-binding domain with 
repression or activation domains or with chromatin remodeling domains. Polynucleotides 
encoding any of the zinc finger proteins, components thereof and fusions thereof are also 
provided. 

The practice of the disclosed methods employs, unless otherwise indicated, 
conventional techniques in molecular biology, biochemistry, genetics, computational 
chemistry, cell culture, recombinant DNA and related fields as are within the skill of the art. 
These techniques are fully explained in the literature. See, for example, Sambrook et al.,
MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor
Laboratory Press, 1989; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John
Wiley & Sons, New York, 1987 and periodic updates; and the series METHODS IN
ENZYMOLOGY, Academic Press, San Diego.

The disclosures of all patents, patent applications and publications mentioned herein 
are hereby incorporated by reference in their entireties. 

Definitions 

The terms "nucleic acid," "polynucleotide," and "oligonucleotide" are used 
interchangeably and refer to a deoxyribonucleotide or ribonucleotide polymer in either single- 
or double-stranded form. For the purposes of the present disclosure, these terms are not to be 
construed as limiting with respect to the length of a polymer. The terms can encompass 
known analogues of natural nucleotides, as well as nucleotides that are modified in the base, 
sugar and/or phosphate moieties. In general, an analogue of a particular nucleotide has the 
same base-pairing specificity; i.e., an analogue of A will base-pair with T. 

The terms "polypeptide," "peptide" and "protein" are used interchangeably to refer to 
a polymer of amino acid residues. The term also applies to amino acid polymers in which one 
or more amino acids are chemical analogues or modified derivatives of a corresponding 
naturally occurring amino acid, for example selenocysteine (Bock et al. (1991) Trends
Biochem. Sci. 16:463-467; Nasim et al. (2000) J. Biol. Chem. 275:14,846-14,852) and the
like. 

A "binding protein" is a protein that is able to bind non-covalently to another 
molecule. A binding protein can bind to, for example, a DNA molecule (a DNA-binding
protein), an RNA molecule (an RNA-binding protein) and/or a protein molecule (a protein-
binding protein). In the case of a protein-binding protein, it can bind to itself (to form
homodimers, homotrimers, etc.) and/or it can bind to one or more molecules of a different
protein or proteins. A binding protein can have more than one type of binding activity. For
example, zinc finger proteins have DNA-binding, RNA-binding and protein-binding activity.
A "binding profile" refers to a plurality of target sequences that are recognized and bound by a 
particular binding protein. For example, a binding profile can be determined by contacting a 
binding protein with a population of randomized target sequences to identify a sub-population 
of target sequences bound by that particular binding protein. 

A "zinc finger binding protein" is a protein or segment within a larger protein that
binds DNA, RNA and/or protein in a sequence-specific manner as a result of stabilization of
protein structure through coordination of a zinc ion. The term zinc finger binding protein is
often abbreviated as zinc finger protein or ZFP. A "canonical" zinc finger refers to a zinc-
coordinating component (e.g., zinc finger) of a zinc finger protein having the general amino
acid sequence: X3-Cys-X2-4-Cys-X12-His-X1-7-His-X4 where X is any amino acid (also known
as a C2H2 zinc finger).

A "modified" zinc finger protein is a protein not occurring in nature that has been
designed and/or selected so as to comprise a substitution of at least one amino acid, compared
to a naturally occurring zinc finger protein. Further, a "designed" zinc finger protein is a
protein not occurring in nature whose structure and composition results principally from
rational criteria. Rational criteria for design include application of substitution rules and
computerized algorithms for processing information in a database storing information of
existing ZFP designs and binding data, for example as described in co-owned PCT WO
00/42219. A "selected" zinc finger protein is a protein not found in nature whose production
results primarily from an empirical process such as phage display. See, e.g., U.S. 5,789,538;
U.S. 6,007,988; U.S. 6,013,453; WO 95/19431; WO 96/06166 and WO 98/54311.
Designed and/or selected ZFPs are also referred to as "engineered" ZFPs and can be modified
according to the methods and compositions disclosed herein (e.g., by conversion to C3H
and/or to comprise a plant backbone).

The term "naturally-occurring" is used to describe an object that can be found in
nature, as distinct from being artificially produced by a human.

A zinc finger "backbone" is the portion of a zinc finger outside the region involved in
DNA major groove interactions; i.e., the regions of the zinc finger outside of residues -1
through +6 of the alpha helix. The backbone comprises the beta strands, the connecting
region between the second beta strand and the alpha helix, the portion of the alpha helix distal
to the first conserved histidine residue, and the inter-finger linker sequence(s).

Nucleic acid or amino acid sequences are "operably linked" (or "operatively linked")
when placed into a functional relationship with one another. For instance, a promoter or
enhancer is operably linked to a coding sequence if it regulates, or contributes to the
modulation of, the transcription of the coding sequence. Operably linked DNA sequences are
typically contiguous, and operably linked amino acid sequences are typically contiguous and
in the same reading frame. However, since enhancers generally function when separated from
the promoter by up to several kilobases or more and intronic sequences may be of variable
lengths, some polynucleotide elements may be operably linked but not contiguous. Similarly,
certain amino acid sequences that are non-contiguous in a primary polypeptide sequence may
nonetheless be operably linked due to, for example, folding of a polypeptide chain.

With respect to fusion polypeptides, the term "operatively linked" can refer to the fact
that each of the components performs the same function in linkage to the other component as
it would if it were not so linked. For example, with respect to a fusion polypeptide in which a
ZFP DNA-binding domain is fused to a transcriptional activation domain (or functional
fragment thereof), the ZFP DNA-binding domain and the transcriptional activation domain (or
functional fragment thereof) are in operative linkage if, in the fusion polypeptide, the ZFP
DNA-binding domain portion is able to bind its target site and/or its binding site, while the
transcriptional activation domain (or functional fragment thereof) is able to activate
transcription.

"Specific binding" between, for example, a ZFP and a specific target site means a
binding affinity of at least 1 x 10^6 M^-1.
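
For orientation, the threshold in the definition of specific binding can be related to a dissociation constant. The following Python sketch assumes that the stated binding affinity is an equilibrium association constant (Ka) for simple one-to-one binding, so that Kd = 1/Ka and the threshold of 1 x 10^6 M^-1 corresponds to a Kd of 1 micromolar or lower. That interpretation, and the function names, are assumptions made for illustration.

```python
# Threshold from the definition of "specific binding", read here as an
# equilibrium association constant Ka in M^-1 (an assumption of this sketch).
SPECIFIC_BINDING_KA = 1e6


def ka_to_kd(ka_per_molar):
    """Convert an association constant (M^-1) to a dissociation constant (M)."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


def is_specific_binding(kd_molar):
    """True if a measured Kd (in M) meets the Ka threshold above."""
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    return 1.0 / kd_molar >= SPECIFIC_BINDING_KA


print(ka_to_kd(SPECIFIC_BINDING_KA))   # threshold expressed as a Kd, in M
print(is_specific_binding(1e-9))       # a 1 nM Kd
print(is_specific_binding(1e-3))       # a 1 mM Kd
```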

A "fusion molecule" is a molecule in which two or more subunit molecules are linked,
preferably covalently. The subunit molecules can be the same chemical type of molecule, or
can be different chemical types of molecules. Examples of the first type of fusion molecule
include, but are not limited to, fusion polypeptides (for example, a fusion between a ZFP
DNA-binding domain and a transcriptional activation domain) and fusion nucleic acids (for
example, a nucleic acid encoding the fusion polypeptide described herein). Examples of the
second type of fusion molecule include, but are not limited to, a fusion between a triplex-
forming nucleic acid and a polypeptide, and a fusion between a minor groove binder and a
nucleic acid.

A "gene," for the purposes of the present disclosure, includes a DNA region encoding
a gene product (see below), as well as all DNA regions that regulate the production of the
gene product, whether or not such regulatory sequences are adjacent to coding and/or
transcribed sequences. Accordingly, a gene includes, but is not necessarily limited to,
promoter sequences, terminators, translational regulatory sequences such as ribosome binding
sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements,
replication origins, matrix attachment sites and locus control regions. Further, a promoter can
be a normal cellular promoter or, for example, a promoter of an infecting microorganism such
as, for example, a bacterium or a virus. For example, the long terminal repeat (LTR) of
retroviruses is a promoter region that may be a target for a modified zinc finger binding
polypeptide. Promoters from members of the Lentivirus group, which include such pathogens
as human T-cell lymphotropic virus (HTLV) 1 and 2, or human immunodeficiency virus
(HIV) 1 or 2, are examples of viral promoter regions which may be targeted for transcriptional
modulation by a modified zinc finger binding polypeptide as described herein.

"Gene expression" refers to the conversion of the information, contained in a gene,
into a gene product. A gene product can be the direct transcriptional product of a gene (e.g.,
mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA or any other type of RNA)
or a protein produced by translation of an mRNA. Gene products also include RNAs that are
modified, by processes such as capping, polyadenylation, methylation, and editing, and
proteins modified by, for example, methylation, acetylation, phosphorylation, ubiquitination,
ADP-ribosylation, myristoylation, and glycosylation.

"Gene activation" and "augmentation of gene expression" refer to any process that
results in an increase in production of a gene product. A gene product can be either RNA
(including, but not limited to, mRNA, rRNA, tRNA, and structural RNA) or protein.
Accordingly, gene activation includes those processes that increase transcription of a gene
and/or translation of an mRNA. Examples of gene activation processes which increase
transcription include, but are not limited to, those which facilitate formation of a transcription
initiation complex, those which increase transcription initiation rate, those which increase
transcription elongation rate, those which increase processivity of transcription and those
which relieve transcriptional repression (by, for example, blocking the binding of a
transcriptional repressor). Gene activation can constitute, for example, inhibition of
repression as well as stimulation of expression above an existing level. Examples of gene
activation processes that increase translation include those that increase translational
initiation, those that increase translational elongation and those that increase mRNA stability.
In general, gene activation comprises any detectable increase in the production of a gene
product, preferably an increase in production of a gene product by about 2-fold, more
preferably from about 2- to about 5-fold or any integral value therebetween, more preferably
between about 5- and about 10-fold or any integral value therebetween, more preferably
between about 10- and about 20-fold or any integral value therebetween, still more preferably
between about 20- and about 50-fold or any integral value therebetween, more preferably
between about 50- and about 100-fold or any integral value therebetween, more preferably
100-fold or more.

"Gene repression" and "inhibition of gene expression" refer to any process that results
in a decrease in production of a gene product. A gene product can be either RNA (including,
but not limited to, mRNA, rRNA, tRNA, and structural RNA) or protein. Accordingly, gene
repression includes those processes that decrease transcription of a gene and/or translation of
an mRNA. Examples of gene repression processes which decrease transcription include, but
are not limited to, those which inhibit formation of a transcription initiation complex, those
which decrease transcription initiation rate, those which decrease transcription elongation rate,
those which decrease processivity of transcription and those which antagonize transcriptional
activation (by, for example, blocking the binding of a transcriptional activator). Gene
repression can constitute, for example, prevention of activation as well as inhibition of
expression below an existing level. Examples of gene repression processes that decrease
translation include those that decrease translational initiation, those that decrease translational
elongation and those that decrease mRNA stability. Transcriptional repression includes both
reversible and irreversible inactivation of gene transcription. In general, gene repression
comprises any detectable decrease in the production of a gene product, preferably a decrease
in production of a gene product by about 2-fold, more preferably from about 2- to about 5-fold
or any integral value therebetween, more preferably between about 5- and about 10-fold or
any integral value therebetween, more preferably between about 10- and about 20-fold or any
integral value therebetween, still more preferably between about 20- and about 50-fold or any
integral value therebetween, more preferably between about 50- and about 100-fold or any
integral value therebetween, more preferably 100-fold or more. Most preferably, gene
repression results in complete inhibition of gene expression, such that no gene product is
detectable.

The term "modulate" refers to a change in the quantity, degree or extent of a function.
For example, the modified zinc finger-nucleotide binding polypeptides disclosed herein may
modulate the activity of a promoter sequence by binding to a motif within the promoter,
thereby inducing, enhancing or suppressing transcription of a gene operatively linked to the
promoter sequence. Alternatively, modulation may include inhibition of transcription of a
gene wherein the modified zinc finger-nucleotide binding polypeptide binds to the structural
gene and blocks DNA-dependent RNA polymerase from reading through the gene, thus
inhibiting transcription of the gene. The structural gene may be a normal cellular gene or an
oncogene, for example. Alternatively, modulation may include inhibition of translation of a
transcript. Thus, "modulation" of gene expression includes both gene activation and gene
repression.

Modulation can be assayed by determining any parameter that is indirectly or directly
affected by the expression of the target gene. Such parameters include, e.g., changes in RNA
or protein levels; changes in protein activity; changes in product levels; changes in
downstream gene expression; changes in transcription or activity of reporter genes such as, for
example, luciferase, CAT, beta-galactosidase, or GFP (see, e.g., Misteli & Spector, (1997)
Nature Biotechnology 15:961-964); changes in signal transduction; changes in
phosphorylation and dephosphorylation; changes in receptor-ligand interactions; changes in
concentrations of second messengers such as, for example, cGMP, cAMP, IP3, and Ca2+;
changes in cell growth, changes in neovascularization, and/or changes in any functional effect
of gene expression. Measurements can be made in vitro, in vivo, and/or ex vivo. Such
functional effects can be measured by conventional methods, e.g., measurement of RNA or
protein levels, measurement of RNA stability, and/or identification of downstream or reporter
gene expression. Readout can be by way of, for example, chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, ligand binding assays; changes in
intracellular second messengers such as cGMP and inositol triphosphate (IP3); changes in
intracellular calcium levels; cytokine release, and the like.

"Eukaryotic cells" include, but are not limited to, fungal cells (such as yeast), plant
cells, animal cells, mammalian cells and human cells. Similarly, "prokaryotic cells" include,
but are not limited to, bacteria.

A "regulatory domain" or "functional domain" refers to a protein or a polypeptide
sequence that has transcriptional modulation activity, or that is capable of interacting with
proteins and/or protein domains that have transcriptional modulation activity. Typically, a
functional domain is covalently or non-covalently linked to a ZFP to modulate transcription of
a gene of interest. Alternatively, a ZFP can act, in the absence of a functional domain, to
modulate transcription. Furthermore, transcription of a gene of interest can be modulated by a
ZFP linked to multiple functional domains.

A "functional fragment" of a protein, polypeptide or nucleic acid is a protein,
polypeptide or nucleic acid whose sequence is not identical to the full-length protein,
polypeptide or nucleic acid, yet retains the same function as the full-length protein,
polypeptide or nucleic acid. A functional fragment can possess more, fewer, or the same
number of residues as the corresponding native molecule, and/or can contain one or more
amino acid or nucleotide substitutions. Methods for determining the function of a nucleic acid
(e.g., coding function, ability to hybridize to another nucleic acid) are well known in the art.
Similarly, methods for determining protein function are well known. For example, the
DNA-binding function of a polypeptide can be determined by filter-binding,
electrophoretic mobility-shift, or immunoprecipitation assays. See Ausubel et al., supra. The
ability of a protein to interact with another protein can be determined, for example, by
co-immunoprecipitation, two-hybrid assays or complementation, both genetic and biochemical.
See, for example, Fields et al. (1989) Nature 340:245-246; U.S. Patent No. 5,585,245 and
PCT WO 98/44350.

A "target site" or "target sequence" is a sequence that is bound by a binding protein
such as, for example, a ZFP. Target sequences can be nucleotide sequences (either DNA or
RNA) or amino acid sequences. By way of example, a DNA target sequence for a
three-finger ZFP is generally either 9 or 10 nucleotides in length, depending upon the presence
and/or nature of cross-strand interactions between the ZFP and the target sequence. Target
sequences can be found in any DNA or RNA sequence, including regulatory sequences,
exons, introns, or any non-coding sequence.

A "target subsite" or "subsite" is the portion of a DNA target site that is bound by a
single zinc finger, excluding cross-strand interactions. Thus, in the absence of cross-strand
interactions, a subsite is generally three nucleotides in length. In cases in which a cross-strand
interaction occurs (e.g., a "D-able subsite," as described for example in co-owned PCT WO
00/42219, incorporated by reference in its entirety herein) a subsite is four nucleotides in
length and overlaps with another 3- or 4-nucleotide subsite.

The term "effective amount" includes that amount which produces the desired result,
for example, deactivation of a previously activated gene, activation of a previously repressed
gene, or inhibition of transcription of a structural gene or translation of RNA.

Zinc Finger Proteins

Zinc finger proteins are formed from zinc finger components. For example, zinc
finger proteins can have one to thirty-seven fingers, commonly having 2, 3, 4, 5 or 6 fingers.
Zinc finger DNA-binding proteins are described, for example, in Miller et al. (1985) EMBO J.
4:1609-1614; Rhodes et al. (1993) Scientific American Feb.:56-65; and Klug (1999) J. Mol.
Biol. 293:215-218. A zinc finger protein recognizes and binds to a target site (sometimes
referred to as a target segment) that represents a relatively small subsequence within a target
gene. Each component finger of a zinc finger protein binds to a subsite within the target site.
The subsite includes a triplet of three contiguous bases on the same strand (sometimes
referred to as the target strand). The three bases in the subsite can be individually denoted the
5' base, the mid base, and the 3' base of the triplet, respectively. The subsite may or may not
also include a fourth base on the non-target strand that is the complement of the base
immediately 3' of the three contiguous bases on the target strand. The base immediately 3' of
the three contiguous bases on the target strand is sometimes referred to as the 3' of the 3'
base. Alternatively, the four bases of the target strand in a four base subsite can be numbered
4, 3, 2, and 1, respectively, starting from the 5' base.

In discussing the specificity-determining regions of a zinc finger, amino acid +1 refers
to the first amino acid in the α-helical portion of the zinc finger. The portion of a zinc finger
that is generally believed to be responsible for its binding specificity lies between -1 and +6.
Amino acid ++2 refers to the amino acid at position +2 in a second zinc finger adjacent (in the
C-terminal direction) to the zinc finger under consideration. In certain circumstances, a zinc
finger binds to its triplet subsite substantially independently of other fingers in the same zinc
finger protein. Accordingly, the binding specificity of a zinc finger protein containing
multiple fingers is, to a first approximation, the aggregate of the specificities of its component
fingers. For example, if a zinc finger protein is formed from first, second and third fingers
that individually bind to triplets XXX, YYY, and ZZZ, the binding specificity of the zinc
finger protein is 3'-XXX YYY ZZZ-5'.

The relative order of fingers in a zinc finger protein, from N-terminal to C-terminal, 
determines the relative order of triplets in the target sequence, in the 3' to 5' direction that will 
be recognized by the fingers. For example, if a zinc finger protein comprises, from N- 
terminal to C-terminal, first, second and third fingers that individually bind to the triplets 
5'-GAC-3', 5'-GTA-3' and 5'-GGC-3', respectively, then the zinc finger protein binds to the 
target sequence 5'-GGCGTAGAC-3' (SEQ ID NO: 3). If the zinc finger protein comprises 
the fingers in another order, for example, second finger, first finger, third finger, then the zinc 
finger protein binds to a target segment comprising a different permutation of triplets, in this 
example, 5'-GGCGACGTA-3' (SEQ ID NO: 4). See Berg et al. (1996) Science 271:1081- 
1086. 
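The ordering rule in the preceding paragraph can be illustrated with a short sketch. The function name `target_site` is hypothetical and is not part of the disclosure; the sketch assumes that each finger binds its triplet independently, which the text describes only as a first approximation.

```python
def target_site(triplets_n_to_c):
    """Return the 5'->3' target sequence for fingers listed N- to C-terminal.

    Each entry is the triplet (written 5'->3') bound by one finger. Because
    the N-terminal finger contacts the 3'-most triplet, the triplets are
    joined in reverse finger order.
    """
    return "".join(reversed(triplets_n_to_c))


# Fingers in the order first, second, third (N- to C-terminal).
print(target_site(["GAC", "GTA", "GGC"]))  # GGCGTAGAC

# Fingers rearranged as second, first, third.
print(target_site(["GTA", "GAC", "GGC"]))  # GGCGACGTA
```

The two printed sequences correspond to the two example target sequences given in the text for the same three fingers in different orders.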

A component finger of a zinc finger protein typically contains approximately 30 amino
acids and comprises the following canonical consensus sequence (from N to C):
Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (SEQ ID NO: 2)

Thus, most C2H2 type zinc fingers contain two invariant cysteine residues in the
beta turn and two invariant histidine residues, these four residues being coordinated
through a zinc atom to maintain the characteristic zinc finger structure. See, e.g., Berg &
Shi (1996) Science 271:1081-1085. The numbering convention used above is standard in
the field for the region of a zinc finger conferring binding specificity. The amino acid on
the N-terminal side of the first invariant His residue is assigned the number +6, and other
amino acids, proceeding in an N-terminal direction, are assigned successively decreasing
numbers. The alpha helix begins at residue +1 and extends to the residue following the
second conserved histidine. The entire helix is therefore of variable length, between 11
and 13 residues.
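As a minimal sketch of the consensus and numbering convention just described, the canonical pattern can be written as a regular expression over one-letter amino acid codes. The names `CANONICAL_FINGER` and `recognition_positions` are hypothetical, and the test string is only an illustrative finger-like sequence chosen to fit the pattern. Positions -1 to +6 are taken as the seven residues immediately N-terminal to the first invariant His (there is no position 0 in this numbering).

```python
import re

# Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His, with "." standing for any residue X.
CANONICAL_FINGER = re.compile(r"C(.{2,4})C(.{12})H(.{3,5})H")


def recognition_positions(sequence):
    """Return residues -1, +1, ..., +6 of the first canonical finger found.

    Position +6 is the residue just N-terminal to the first invariant His,
    so the seven positions are the last seven residues of the 12-residue
    spacer between the second Cys and the first His. Returns None if no
    canonical finger is present.
    """
    match = CANONICAL_FINGER.search(sequence)
    if match is None:
        return None
    return match.group(2)[-7:]


finger = "YACPVESCDRRFSRSDELTRHIRIHTGQKP"  # illustrative sequence
print(recognition_positions(finger))        # RSDELTR

# Replacing the second His with Cys yields a finger the canonical
# pattern no longer matches.
print(recognition_positions("YACPVESCDRRFSRSDELTRHIRICTGQKP"))  # None
```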

Certain DNA-binding domains are capable of binding to DNA that is packaged in
nucleosomes. See, for example, Cordingley et al. (1987) Cell 48:261-270; Pina et al. (1990)
Cell 60:719-731; and Cirillo et al. (1998) EMBO J. 17:244-254. Certain ZFP-containing
proteins such as, for example, members of the nuclear hormone receptor superfamily, are
capable of binding DNA sequences packaged into chromatin. These include, but are not
limited to, the glucocorticoid receptor and the thyroid hormone receptor. Archer et al. (1992)
Science 255:1573-1576; Wong et al. (1997) EMBO J. 16:7130-7145. Other DNA-binding
domains, including certain ZFP-containing binding domains, require more accessible DNA
for binding. In the latter case, the required binding specificity of the DNA-binding domain
can be determined by identifying accessible regions in the cellular chromatin. Accessible
regions can be determined as described in co-owned International Publications WO 01/83751
and WO 01/83732, the disclosures of which are hereby incorporated by reference herein. A
modified ZFP DNA-binding domain is designed and/or selected to bind to a target site within
the accessible region.

A. Non-Canonical ZFPs

The compositions and methods disclosed herein include modified, preferably
non-canonical (e.g., non-C2H2), zinc finger proteins that specifically bind to a target sequence.
Non-canonical ZFP DNA-binding domains can be designed and/or selected to recognize a
particular target site, for example as described in co-owned WO 00/42219; WO 00/41566; as
well as U.S. Patents 5,789,538; 6,007,408; 6,013,453; 6,140,081 and 6,140,466; and PCT
publications WO 95/19431, WO 98/54311, WO 00/23464 and WO 00/27878. In preferred
embodiments, the process of designing or selecting a non-canonical, non-naturally occurring
ZFP typically starts with a natural ZFP as a source of framework residues, as described in
co-owned PCT WO 00/42219; WO 98/53057; WO 98/53058; WO 98/53059 and WO 98/53060.

Briefly, the methods disclosed herein serve to modify the typically invariant Cys and
His residues while maintaining (or enhancing) the desired binding specificity of a ZFP. The
process of obtaining a non-naturally occurring ZFP with a predetermined binding specificity
typically starts with a natural ZFP as a source of framework residues. The process of design
or selection serves to define non-conserved positions (i.e., positions -1 to +6) so as to confer a
desired binding specificity. One ZFP suitable for use as a framework is the DNA-binding
domain of the mouse transcription factor Zif268. Another suitable natural zinc finger protein
as a source of framework residues is Sp-1. The Sp-1 sequence used for construction of zinc
finger proteins corresponds to amino acids 531 to 624 in the Sp-1 transcription factor. An
additional useful ZFP backbone is that of the Sp-1 consensus sequence, described by Shi et al.
(1995) Chemistry and Biology 1:83-89. The amino acid sequences of these ZFP frameworks
are disclosed in co-owned PCT WO 00/42219, the disclosure of which is incorporated by
reference. In other aspects, the ZFP backbone will comprise a modified plant ZFP backbone
into which one or more of the non-canonical fingers described herein are inserted so that they
bind to a target sequence. Other suitable ZFPs are known to those of skill in the art and are
described herein. The documents cited supra also disclose methods of assessing binding
specificity of modified ZFPs.

Non-canonical zinc fingers therefore include one or more zinc finger components
in which at least one of the C2H2 amino acids has been replaced with one or more amino
acids. In certain embodiments, more than one of the canonical amino acids is replaced.
Examples of non-canonical zinc finger components include:

(X)3-B-(X)2-4-Cys-(X)12-His-(X)1-7-His-(X)4
(X)3-Cys-(X)2-4-B-(X)12-His-(X)1-7-His-(X)4
(X)3-Cys-(X)2-4-Cys-(X)12-Z-(X)1-7-His-(X)4
(X)3-Cys-(X)2-4-Cys-(X)12-His-(X)1-7-Z-(X)4
(X)3-B-(X)2-4-B-(X)12-His-(X)1-7-His-(X)4
(X)3-B-(X)2-4-Cys-(X)12-Z-(X)1-7-His-(X)4
(X)3-B-(X)2-4-Cys-(X)12-His-(X)1-7-Z-(X)4
(X)3-Cys-(X)2-4-B-(X)12-Z-(X)1-7-His-(X)4
(X)3-Cys-(X)2-4-B-(X)12-His-(X)1-7-Z-(X)4
(X)3-Cys-(X)2-4-Cys-(X)12-Z-(X)1-7-Z-(X)4
(X)3-Cys-(X)2-4-B-(X)12-Z-(X)1-7-Z-(X)4
(X)3-B-(X)2-4-Cys-(X)12-Z-(X)1-7-Z-(X)4
(X)3-B-(X)2-4-B-(X)12-His-(X)1-7-Z-(X)4
(X)3-B-(X)2-4-B-(X)12-Z-(X)1-7-His-(X)4
(X)3-B-(X)2-4-B-(X)12-Z-(X)1-7-Z-(X)4
(X)3-Y-(X)2-4-Cys-(X)12-His-(X)1-7-His-(X)4
(X)3-Cys-(X)2-4-Y-(X)12-His-(X)1-7-His-(X)4
(X)3-Cys-(X)2-4-Cys-(X)12-Y-(X)1-7-His-(X)4
(X)3-Cys-(X)2-4-Cys-(X)12-His-(X)1-7-Y-(X)4
(X)3-Y-(X)2-4-Y-(X)12-His-(X)1-7-His-(X)4
(X)3-Y-(X)2-4-Cys-(X)12-Y-(X)1-7-His-(X)4
(X)3-Y-(X)2-4-Cys-(X)12-His-(X)1-7-Y-(X)4
(X)3-Cys-(X)2-4-Y-(X)12-Y-(X)1-7-His-(X)4
(X)3-Cys-(X)2-4-Y-(X)12-His-(X)1-7-Y-(X)4
(X)3-Cys-(X)2-4-Cys-(X)12-Y-(X)1-7-Y-(X)4
(X)3-Cys-(X)2-4-Y-(X)12-Y-(X)1-7-Y-(X)4
(X)3-Y-(X)2-4-Cys-(X)12-Y-(X)1-7-Y-(X)4
(X)3-Y-(X)2-4-Y-(X)12-His-(X)1-7-Y-(X)4
(X)3-Y-(X)2-4-Y-(X)12-Y-(X)1-7-His-(X)4
(X)3-Y-(X)2-4-Y-(X)12-Y-(X)1-7-Y-(X)4

where X = any amino acid
B = any amino acid except cysteine
Z = any amino acid except histidine
Y = any amino acid except histidine or cysteine
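The general form shared by the patterns listed above can be sketched as a small pattern builder. This is only an illustration under stated assumptions: `RESIDUE_CLASS` and `finger_regex` are hypothetical names, sequences are one-letter amino acid codes, and the test string is synthetic filler rather than a real finger.

```python
import re

# Residue classes as defined in the text, written as regex character classes.
RESIDUE_CLASS = {
    "Cys": "C",
    "His": "H",
    "B": "[^C]",   # any amino acid except cysteine
    "Z": "[^H]",   # any amino acid except histidine
    "Y": "[^CH]",  # any amino acid except histidine or cysteine
}


def finger_regex(p1, p2, p3, p4):
    """Build a regex for (X)3-p1-(X)2-4-p2-(X)12-p3-(X)1-7-p4-(X)4.

    "." stands for X (any residue); it does not check that characters
    are valid amino acid codes.
    """
    c = RESIDUE_CLASS
    return re.compile(
        f".{{3}}({c[p1]}).{{2,4}}({c[p2]}).{{12}}({c[p3]}).{{1,7}}({c[p4]}).{{4}}"
    )


# Synthetic finger in which the second His position is occupied by Cys.
c2hc = "AAA" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "C" + "AAAA"

print(bool(finger_regex("Cys", "Cys", "His", "Z").fullmatch(c2hc)))    # True
print(bool(finger_regex("Cys", "Cys", "His", "His").fullmatch(c2hc)))  # False
```

Because Z excludes only histidine, a finger with Cys at the fourth zinc-coordinating position fits the Cys-Cys-His-Z form but not the all-canonical form.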



A modified ZFP can include any number of zinc finger components, although a
three-finger structure is generally preferred. Typically, the C-terminal-most (e.g., third) finger of
the ZFP is modified and non-canonical. The other fingers of the protein may be naturally
occurring zinc finger components, non-canonical modified components, modified C2H2
fingers or combinations of these components. Thus, as described below in Example 2, in
certain embodiments, a three-finger zinc finger binding protein is provided wherein the first
two fingers are of the C2H2 class and, in the third (C-terminal-most) finger, the second
histidine is substituted with Cys or with Cys and additional amino acids, such as glycine. In
other embodiments, a three-finger zinc finger protein is provided wherein, in the first
(N-terminal-most) finger, the first cysteine residue is substituted with histidine or with histidine
and additional amino acids such as glycine. Furthermore, in certain embodiments, the second
(middle) finger of a three-finger ZFP is modified such that one or both of the cysteines are
replaced with serines (and/or additional amino acids).

Also included herein are nucleic acids encoding a ZFP comprising at least one
non-canonical zinc finger as described herein.



B. Linkage 

Two or more zinc finger proteins can be linked to have a target site specificity that is,
to a first approximation, the aggregate of that of the component zinc finger proteins. For
example, a first zinc finger protein having first, second and third component fingers that
respectively bind to XXX, YYY and ZZZ can be linked to a second zinc finger protein having
first, second and third component fingers with binding specificities AAA, BBB and CCC.
The binding specificity of the combined first and second proteins is thus
5'-CCCBBBAAANZZZYYYXXX-3', where N indicates a short intervening region (typically
0-5 bases of any type). In this situation, the target site can be viewed as comprising two target
segments separated by an intervening segment.

Linkage of zinc finger proteins can be accomplished using any of the following
peptide linkers:

TGEKP (SEQ ID NO: 5) Liu et al. (1997) Proc. Natl. Acad. Sci. USA 94:5525-5530.
(G4S)n (SEQ ID NO: 6) Kim et al. (1996) Proc. Natl. Acad. Sci. USA 93:1156-1160.
GGRRGGGS (SEQ ID NO: 7)
LRQRDGERP (SEQ ID NO: 8)
LRQKDGGGSERP (SEQ ID NO: 9)
LRQKD(G3S)2ERP (SEQ ID NO: 10).

Alternatively, flexible linkers can be rationally designed using computer programs
capable of modeling both DNA-binding sites and the peptides themselves, or by phage display
methods. In a further variation, non-covalent linkage can be achieved by fusing two zinc
finger proteins with domains promoting heterodimer formation of the two zinc finger proteins.
For example, one zinc finger protein can be fused with fos and the other with jun (see Barbas
et al., WO 95/19431). Alternatively, dimerization interfaces can be obtained by selection.
See, for example, Wang et al. (1999) Proc. Natl. Acad. Sci. USA 96:9568-9573.

Linkage of two or more zinc finger proteins is advantageous for conferring a unique
binding specificity within a mammalian genome. A typical mammalian diploid genome
consists of 3 x 10^9 bp. Assuming that the four nucleotides A, C, G, and T are randomly
distributed, a given 9 bp sequence is present ~23,000 times. Thus a three-finger ZFP
recognizing a 9 bp target with absolute specificity would have the potential to bind to ~23,000
sites within the genome. An 18 bp sequence is present once in 3.4 x 10^10 bp, or about once in
a random DNA sequence whose complexity is ten times that of a mammalian genome. Thus,
linkage of two three-finger ZFPs, to recognize an 18 bp target sequence, provides the requisite
specificity to target a unique site in a typical mammalian genome.
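The figures in the preceding paragraph can be reproduced with a short calculation. This sketch assumes equal, independent base frequencies and counts a site on either strand (the `strands=2` default); the two-strand factor is an assumption made here because it reproduces the stated figures, and the function names are hypothetical.

```python
GENOME_BP = 3e9  # genome size used in the text


def expected_sites(site_length, genome_bp=GENOME_BP, strands=2):
    """Expected occurrences of one specific site in random sequence."""
    return strands * genome_bp / 4 ** site_length


def bp_per_site(site_length, strands=2):
    """Length of random sequence expected to contain the site once."""
    return 4 ** site_length / strands


print(round(expected_sites(9)))    # 22888, i.e. about 23,000
print(f"{bp_per_site(18):.2e}")    # 3.44e+10
print(expected_sites(18) < 1)      # True: fewer than one expected site
```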

C. Fusion Molecules

The selection and/or design of non-canonical zinc finger-containing proteins also
allows for the design of fusion molecules that facilitate regulation of gene expression.
Thus, in certain embodiments, the compositions and methods disclosed herein involve
fusions between at least one of the zinc finger proteins described herein (or functional
fragments thereof) and one or more functional domains (or functional fragments thereof),
or a polynucleotide encoding such a fusion. The presence of such a fusion molecule in a
cell allows a functional domain to be brought into proximity with a sequence in a gene
that is bound by the zinc finger portion of the fusion molecule. The transcriptional
regulatory function of the functional domain is then able to act on the gene, by, for
example, modulating expression of the gene.

In certain embodiments, fusion proteins comprising a modified zinc finger
DNA-binding domain and a functional domain are used for modulation of endogenous gene
expression as described, for example, in co-owned PCT WO 00/41566. Modulation
includes repression and activation of gene expression; the nature of the modulation
generally depending on the type of functional domain present in the fusion protein. Any
polypeptide sequence or domain capable of influencing gene expression (or functional
fragment thereof) that can be fused to a DNA-binding domain, is suitable for use.

An exemplary functional domain for fusing with a ZFP DNA-binding domain, to
be used for repressing gene expression, is a KRAB repression domain from the human
KOX-1 protein (see, e.g., Thiesen et al., New Biologist 2, 363-374 (1990); Margolin et
al., Proc. Natl. Acad. Sci. USA 91, 4509-4513 (1994); Pengue et al., Nucl. Acids Res.
22:2908-2914 (1994); Witzgall et al., Proc. Natl. Acad. Sci. USA 91, 4514-4518 (1994)).
Another suitable repression domain is methyl binding domain protein 2B (MBD-2B)
(see also Hendrich et al. (1999) Mamm Genome 10:906-912 for description of MBD
proteins). Another useful repression domain is that associated with the v-ErbA protein.
See, for example, Damm, et al. (1989) Nature 339:593-597; Evans (1989) Int. J. Cancer
Suppl. 4:26-28; Pain et al. (1990) New Biol. 2:284-294; Sap et al. (1989) Nature
340:242-244; Zenke et al. (1988) Cell 52:107-119; and Zenke et al. (1990) Cell
61:1035-1049. Additional exemplary repression domains include, but are not limited to,
thyroid hormone receptor (TR), SID, MBD1, MBD2, MBD3, MBD4, MBD-like proteins,
members of the DNMT family (e.g., DNMT1, DNMT3A, DNMT3B), Rb, MeCP1 and
MeCP2. See, for example, Zhang et al. (2000) Ann Rev Physiol 62:439-466; Bird et al.
(1999) Cell 99:451-454; Tyler et al. (1999) Cell 99:443-446; Knoepfler et al. (1999)
Cell 99:447-450; and Robertson et al. (2000) Nature Genet. 25:338-342. Additional
exemplary repression domains include, but are not limited to, ROM2 and AtHD2A. See,
for example, Chern et al. (1996) Plant Cell 8:305-321; and Wu et al. (2000) Plant J.
22:19-27.

Suitable domains for achieving activation include the HSV VP16 activation
domain (see, e.g., Hagmann et al., J. Virol. 71, 5952-5962 (1997)); nuclear hormone
receptors (see, e.g., Torchia et al., Curr. Opin. Cell. Biol. 10:373-383 (1998)); the p65
subunit of nuclear factor kappa B (Bitko & Barik, J. Virol. 72:5610-5618 (1998);
Doyle & Hunt, Neuroreport 8:2937-2942 (1997); and Liu et al., Cancer Gene Ther. 5:3-28
(1998)), or artificial chimeric functional domains such as VP64 (Seipel et al., EMBO J.
11, 4961-4968 (1992)).

Additional exemplary activation domains include, but are not limited to, p300,
CBP, PCAF, SRC1, PvALF, AtHD2A and ERF-2. See, for example, Robyr et al. (2000)
Mol. Endocrinol. 14:329-347; Collingwood et al. (1999) J. Mol. Endocrinol. 23:255-275;
Leo et al. (2000) Gene 245:1-11; Manteuffel-Cymborowska (1999) Acta Biochim.
Pol. 46:77-89; McKenna et al. (1999) J. Steroid Biochem. Mol. Biol. 69:3-12; Malik et
al. (2000) Trends Biochem. Sci. 25:277-283; and Lemon et al. (1999) Curr. Opin. Genet.
Dev. 9:499-504. Additional exemplary activation domains include, but are not limited to,
OsGAI, HALF-1, C1, AP1, ARF-5, -6, -7, and -8, CPRF1, CPRF4, MYC-RP/GP, and
TRAB1. See, for example, Ogawa et al. (2000) Gene 245:21-29; Okanami et al. (1996)
Genes Cells 1:87-99; Goff et al. (1991) Genes Dev. 5:298-309; Cho et al. (1999) Plant
Mol. Biol. 40:419-429; Ulmasov et al. (1999) Proc. Natl. Acad. Sci. USA 96:5844-5849;
Sprenger-Haussels et al. (2000) Plant J. 22:1-8; Gong et al. (1999) Plant Mol. Biol.
41:33-44; and Hobo et al. (1999) Proc. Natl. Acad. Sci. USA 96:15,348-15,353.

Additional functional domains are disclosed, for example, in co-owned
WO 00/41566. Further, insulator domains, chromatin remodeling proteins such as
ISWI-containing domains and/or methyl binding domain proteins suitable for use in fusion
molecules are described, for example, in co-owned International Publication WO
01/83793 and PCT/US01/42377.

In additional embodiments, targeted remodeling of chromatin, as disclosed in co- 
owned International patent publication WO 01/83793 can be used to generate one or 
more sites in cellular chromatin that are accessible to the binding of a functional 
domain/modified ZFP fusion molecule. 

Fusion molecules are constructed by methods of cloning and biochemical 
conjugation that are well known to those of skill in the art. Fusion molecules comprise a 
modified ZFP binding domain and, for example, a transcriptional activation domain, a 
transcriptional repression domain, a component of a chromatin remodeling complex, an 
insulator domain or a functional fragment of any of these domains. In certain 
embodiments, fusion molecules comprise a non-canonical zinc finger protein and at least 
two functional domains (e.g., an insulator domain or a methyl binding protein domain 
and, additionally, a transcriptional activation or repression domain). Fusion molecules 
also optionally comprise nuclear localization signals (such as, for example, that from the 
SV40 large T-antigen) and epitope tags (such as, for example, FLAG, see Example 2,
and hemagglutinin). Fusion proteins (and nucleic acids encoding them) are designed 
such that the translational reading frame is preserved among the components of the 
fusion. 
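The reading-frame requirement stated above can be checked mechanically. The following is a minimal sketch, not a description of any particular cloning procedure: `in_frame_fusion` is a hypothetical helper, the segments are assumed to be coding DNA given 5' to 3' with the first segment starting in frame, and the example sequences are arbitrary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_fusion(segments):
    """Return True if joining the coding segments preserves the reading frame.

    Each segment must be a whole number of codons, and no stop codon may
    occur in frame before the final codon of the joined sequence.
    """
    if any(len(segment) % 3 for segment in segments):
        return False
    joined = "".join(segments).upper()
    codons = [joined[i:i + 3] for i in range(0, len(joined), 3)]
    return not any(codon in STOP_CODONS for codon in codons[:-1])


print(in_frame_fusion(["ATGGCT", "GGTTCT", "AAATAA"]))  # True
print(in_frame_fusion(["ATGGC", "TGGTTCT"]))            # False: partial codons
print(in_frame_fusion(["ATGTAA", "GGT"]))               # False: internal stop
```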

The fusion molecules disclosed herein comprise a non-canonical zinc finger 
binding protein which binds to a target site. In certain embodiments, the target site is 
present in an accessible region of cellular chromatin. Accessible regions can be 
determined as described in co-owned International PCT Publications WO 01/83751 and 
WO 01/83732. If the target site is not present in an accessible region of cellular 
chromatin, one or more accessible regions can be generated as described, for example, 
in co-owned International PCT Publication WO 01/83793. 

In additional embodiments, the non-canonical zinc finger component of a fusion 
molecule is capable of binding to cellular chromatin regardless of whether its target site is 
in an accessible region or not. For example, a modified ZFP as disclosed herein can be 
capable of binding to linker DNA and/or to nucleosomal DNA. Examples of this type of
"pioneer" DNA binding domain are found in certain steroid receptors and in hepatocyte
nuclear factor 3 (HNF3). Cordingley et al. (1987) Cell 48:261-270; Pina et al. (1990)
Cell 60:719-731; and Cirillo et al. (1998) EMBO J. 17:244-254.

Methods of gene regulation using a functional domain, targeted to a specific
sequence by virtue of a fused DNA binding domain, can achieve modulation of gene
expression. Genes so modulated can be endogenous genes or exogenous genes.
Modulation of gene expression can be in the form of repression (e.g., repressing
expression of exogenous genes, for example, when the target gene resides in a
pathological infecting microorganism, or repression of an endogenous gene of the
subject, such as an oncogene or a viral receptor, that contributes to a disease state). As
described herein, repression of a specific target gene can be achieved by using a fusion
molecule comprising a non-canonical zinc finger protein and a functional domain.

Alternatively, modulation can be in the form of activation, if activation of a gene
(e.g., a tumor suppressor gene or a transgene) can ameliorate a disease state. In this case,
cellular chromatin is contacted with any of the fusion molecules described herein,
wherein the modified zinc finger portion of the fusion molecule is specific for the target
gene. The functional domain (e.g., insulator domain, activation domain, etc.) enables
increased and/or sustained expression of the target gene.

For any such applications, the fusion molecule(s) can be formulated with a
pharmaceutically acceptable carrier, as is known to those of skill in the art. See, for
example, Remington's Pharmaceutical Sciences, 17th ed., 1985; and co-owned WO
00/42219.

Polynucleotide and Polypeptide Delivery 

The compositions described herein can be provided to the target cell in vitro or in
vivo. In addition, the compositions can be provided as polypeptides, polynucleotides or
a combination thereof.

A. Delivery of Polynucleotides 

In certain embodiments, the compositions are provided as one or more
polynucleotides. Further, as noted above, a non-canonical zinc finger protein-containing
composition can be designed as a fusion between a polypeptide zinc finger and a
functional domain that is encoded by a fusion nucleic acid. In both fusion and non-fusion
cases, the nucleic acid can be cloned into intermediate vectors for transformation into
prokaryotic or eukaryotic cells for replication and/or expression. Intermediate vectors for
storage or manipulation of the nucleic acid or production of protein can be prokaryotic
vectors (e.g., plasmids), shuttle vectors, insect vectors, or viral vectors, for example. A
nucleic acid encoding a non-canonical zinc finger protein can also be cloned into an
expression vector, for administration to a bacterial cell, fungal cell, protozoal cell, plant
cell, or animal cell, preferably a mammalian cell, more preferably a human cell.

To obtain expression of a cloned nucleic acid, it is typically subcloned into an
expression vector that contains a promoter to direct transcription. Suitable bacterial and
eukaryotic promoters are well known in the art and described, e.g., in Sambrook et al.,
supra; Ausubel et al., supra; and Kriegler, Gene Transfer and Expression: A
Laboratory Manual (1990). Bacterial expression systems are available in, e.g., E. coli,
Bacillus sp., and Salmonella. Palva et al. (1983) Gene 22:229-235. Kits for such
expression systems are commercially available. Eukaryotic expression systems for
mammalian cells, yeast, and insect cells are well known in the art and are also
commercially available, for example, from Invitrogen, Carlsbad, CA and Clontech, Palo
Alto, CA.

The promoter used to direct expression of the nucleic acid of choice depends on
the particular application. For example, a strong constitutive promoter is typically used
for expression and purification. In contrast, when a protein is to be used in vivo, either a
constitutive or an inducible promoter is used, depending on the particular use of the
protein. In addition, a weak promoter can be used, such as HSV TK or a promoter having
similar activity. The promoter typically can also include elements that are responsive to
transactivation, e.g., hypoxia response elements, Gal4 response elements, lac repressor
response element, and small molecule control systems such as tet-regulated systems and
the RU-486 system. See, e.g., Gossen et al. (1992) Proc. Natl. Acad. Sci. USA 89:5547-
5551; Oligino et al. (1998) Gene Ther. 5:491-496; Wang et al. (1997) Gene Ther. 4:432-
441; Neering et al. (1996) Blood 88:1147-1155; and Rendahl et al. (1998) Nat.
Biotechnol. 16:757-761.

In addition to a promoter, an expression vector typically contains a transcription
unit or expression cassette that contains additional elements required for the expression of
the nucleic acid in host cells, either prokaryotic or eukaryotic. A typical expression
cassette thus contains a promoter operably linked, e.g., to the nucleic acid sequence, and
signals required, e.g., for efficient polyadenylation of the transcript, transcriptional
termination, ribosome binding, and/or translation termination. Additional elements of the
cassette may include, e.g., enhancers, and heterologous spliced intronic signals.
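
By way of informal illustration only, the cassette elements named in the preceding paragraph can be written down as a simple checklist. The Python sketch below is not part of the disclosed methods; the class name `ExpressionCassette`, the element labels, and the helper `absent_typical_elements` are hypothetical and merely mirror the elements listed above. Because the text joins the signals with "and/or", the sketch reports which typical elements are absent rather than treating any of them as mandatory.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical labels mirroring the elements named in the text;
# this is not a standard vocabulary.
TYPICAL_ELEMENTS = (
    "promoter",
    "coding_sequence",
    "polyadenylation_signal",
    "transcription_terminator",
    "ribosome_binding_site",
    "translation_terminator",
)

# Additional elements the text says a cassette may include.
OPTIONAL_ELEMENTS = ("enhancer", "heterologous_spliced_intronic_signal")


@dataclass
class ExpressionCassette:
    """A cassette described only by the labels of the elements it carries."""
    elements: List[str] = field(default_factory=list)


def absent_typical_elements(cassette: ExpressionCassette) -> List[str]:
    """Return the typical elements not present, in the order listed above."""
    return [e for e in TYPICAL_ELEMENTS if e not in cassette.elements]
```

A cassette annotated with only a promoter and a coding sequence would, under these assumed labels, be reported as lacking the four signal elements; whether any of them is actually needed depends on the host and the application.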

The particular expression vector used to transport the genetic information into the 
cell is selected with regard to the intended use of the resulting ZFP polypeptide, e.g., 
expression in plants, animals, bacteria, fungi, protozoa, etc. Standard bacterial expression
vectors include plasmids such as pBR322, pBR322-based plasmids, pSKF, pET23D, and
commercially available fusion expression systems such as GST and LacZ. Epitope tags
can also be added to recombinant proteins to provide convenient methods of isolation, for
monitoring expression, and for monitoring cellular and subcellular localization, e.g.,
c-myc or FLAG.

Expression vectors containing regulatory elements from eukaryotic viruses are 
often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors,
and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include
pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, and any other
vector allowing expression of proteins under the direction of the SV40 early promoter,
SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, 
Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective 
for expression in eukaryotic cells. 

Some expression systems have markers for selection of stably transfected cell 
lines such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate
reductase. High-yield expression systems are also suitable, such as baculovirus vectors in
insect cells, with a nucleic acid sequence coding for a ZFP as described herein under the 
transcriptional control of the polyhedrin promoter or any other strong baculovirus 
promoter. 

Elements that are typically included in expression vectors also include a replicon
that functions in E. coli (or in the prokaryotic host, if other than E. coli), a selective
marker, e.g., a gene encoding antibiotic resistance, to permit selection of bacteria that
harbor recombinant plasmids, and unique restriction sites in nonessential regions of the 
vector to allow insertion of recombinant sequences. 

Standard transfection methods can be used to produce bacterial, mammalian,
yeast, insect, or other cell lines that express large quantities of non-canonical zinc finger
proteins, which can be purified, if desired, using standard techniques. See, e.g., Colley et
al. (1989) J. Biol. Chem. 264:17619-17622; and Guide to Protein Purification, in
Methods in Enzymology, vol. 182 (Deutscher, ed.) 1990. Transformation of eukaryotic
and prokaryotic cells is performed according to standard techniques. See, e.g., Morrison
(1977) J. Bacteriol. 132:349-351; Clark-Curtiss et al. (1983) in Methods in Enzymology
101:347-362 (Wu et al., eds).

Any procedure for introducing foreign nucleotide sequences into host cells can be 
used. These include, but are not limited to, the use of calcium phosphate transfection, 
DEAE-dextran-mediated transfection, polybrene, protoplast fusion, electroporation,
lipid-mediated delivery (e.g., liposomes), microinjection, particle bombardment, introduction
of naked DNA, plasmid vectors, viral vectors (both episomal and integrative) and any of
the other well known methods for introducing cloned genomic DNA, cDNA, synthetic
DNA or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra).
It is only necessary that the particular genetic engineering procedure used be capable of
successfully introducing at least one gene into the host cell capable of expressing the
protein of choice. 

Conventional viral and non-viral based gene transfer methods can be used to
introduce nucleic acids into mammalian cells or target tissues. Such methods can be used
to administer nucleic acids encoding reprogramming polypeptides to cells in vitro.
Preferably, nucleic acids are administered for in vivo or ex vivo gene therapy uses.
Non-viral vector delivery systems include DNA plasmids, naked nucleic acid, and nucleic acid
complexed with a delivery vehicle such as a liposome. Viral vector delivery systems
include DNA and RNA viruses, which have either episomal or integrated genomes after
delivery to the cell. For reviews of gene therapy procedures, see, for example, Anderson
(1992) Science 256:808-813; Nabel et al. (1993) Trends Biotechnol. 11:211-217; Mitani
et al. (1993) Trends Biotechnol. 11:162-166; Dillon (1993) Trends Biotechnol. 11:167-
175; Miller (1992) Nature 357:455-460; Van Brunt (1988) Biotechnology 6(10):1149-
1154; Vigne (1995) Restorative Neurology and Neuroscience 8:35-36; Kremer et al.
(1995) British Medical Bulletin 51(1):31-44; Haddada et al., in Current Topics in
Microbiology and Immunology, Doerfler and Bohm (eds), 1995; and Yu et al. (1994)
Gene Therapy 1:13-26.

Methods of non-viral delivery of nucleic acids include lipofection, microinjection,
ballistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid
conjugates, naked DNA, artificial virions, and agent-enhanced uptake of DNA.
Lipofection is described in, e.g., U.S. Patent Nos. 5,049,386; 4,946,787; and 4,897,355
and lipofection reagents are sold commercially (e.g., Transfectam™ and Lipofectin™).
Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection
of polynucleotides include those of Felgner, WO 91/17424 and WO 91/16024. Nucleic
acid can be delivered to cells (ex vivo administration) or to target tissues (in vivo
administration).

The preparation of lipid:nucleic acid complexes, including targeted liposomes
such as immunolipid complexes, is well known to those of skill in the art. See, e.g.,
Crystal (1995) Science 270:404-410; Blaese et al. (1995) Cancer Gene Ther. 2:291-297;
Behr et al. (1994) Bioconjugate Chem. 5:382-389; Remy et al. (1994) Bioconjugate
Chem. 5:647-654; Gao et al. (1995) Gene Therapy 2:710-722; Ahmad et al. (1992)
Cancer Res. 52:4817-4820; and U.S. Patent Nos. 4,186,183; 4,217,344; 4,235,871;
4,261,975; 4,485,054; 4,501,728; 4,774,085; 4,837,028 and 4,946,787.

RNA or DNA virus-based systems for the delivery of nucleic acids
take advantage of highly evolved processes for targeting a virus to specific cells in the
body and trafficking the viral payload to the nucleus. Viral vectors can be administered 
directly to patients (in vivo) or they can be used to treat cells in vitro, wherein the 
modified cells are administered to patients (ex vivo). Conventional viral based systems 
for the delivery of ZFPs include retroviral, lentiviral, poxviral, adenoviral, adeno- 
associated viral, vesicular stomatitis viral and herpesviral vectors. Integration in the host 
genome is possible with certain viral vectors, including the retrovirus, lentivirus, and 
adeno-associated virus gene transfer methods, often resulting in long term expression of 
the inserted transgene. Additionally, high transduction efficiencies have been observed
in many different cell types and target tissues. 

The tropism of a retrovirus can be altered by incorporating foreign envelope 
proteins, allowing alteration and/or expansion of the potential target cell population. 
Lentiviral vectors are retroviral vectors that are able to transduce or infect non-dividing
cells and typically produce high viral titers. Selection of a retroviral gene transfer system
would therefore depend on the target tissue. Retroviral vectors have a packaging capacity
of up to 6-10 kb of foreign sequence and are comprised of cis-acting long terminal
repeats (LTRs). The minimum cis-acting LTRs are sufficient for replication and
packaging of the vectors, which are then used to integrate the therapeutic gene into the
target cell to provide permanent transgene expression. Widely used retroviral vectors
include those based upon murine leukemia virus (MuLV), gibbon ape leukemia virus
(GaLV), simian immunodeficiency virus (SIV), human immunodeficiency virus (HIV),
and combinations thereof (see, e.g., Buchscher et al. (1992) J. Virol. 66:2731-2739; Johann et al.
(1992) J. Virol. 66:1635-1640; Sommerfelt et al. (1990) Virol. 176:58-59; Wilson et al.
(1989) J. Virol. 63:2374-2378; Miller et al. (1991) J. Virol. 65:2220-2224; and
PCT/US94/05700).

Adeno-associated virus (AAV) vectors are also used to transduce cells with target 
nucleic acids, e.g., in the in vitro production of nucleic acids and peptides, and for in vivo 
and ex vivo gene therapy procedures. See, e.g., West et al. (1987) Virology 160:38-47;
U.S. Patent No. 4,797,368; WO 93/24641; Kotin (1994) Hum. Gene Ther. 5:793-801;
and Muzyczka (1994) J. Clin. Invest. 94:1351. Construction of recombinant AAV
vectors is described in a number of publications, including U.S. Patent No. 5,173,414;
Tratschin et al. (1985) Mol. Cell. Biol. 5:3251-3260; Tratschin et al. (1984) Mol. Cell.
Biol. 4:2072-2081; Hermonat et al. (1984) Proc. Natl. Acad. Sci. USA 81:6466-6470;
and Samulski et al. (1989) J. Virol. 63:3822-3828.

Recombinant adeno-associated virus vectors based on the defective and 
nonpathogenic parvovirus adeno-associated virus type 2 (AAV-2) are a promising gene 
delivery system. Exemplary AAV vectors are derived from a plasmid containing the 
AAV 145 bp inverted terminal repeats flanking a transgene expression cassette. Efficient 
gene transfer and stable transgene delivery due to integration into the genomes of the 
transduced cell are key features for this vector system. Wagner et al. (1998) Lancet
351(9117):1702-3; and Kearns et al. (1996) Gene Ther. 9:748-55.

pLASN and MFG-S are examples of retroviral vectors that have been used in clinical
trials. Dunbar et al. (1995) Blood 85:3048-305; Kohn et al. (1995) Nature Med. 1:1017-102;
Malech et al. (1997) Proc. Natl. Acad. Sci. USA 94:12133-12138. PA317/pLASN was the
first therapeutic vector used in a gene therapy trial. Blaese et al. (1995) Science 270:475-480.
Transduction efficiencies of 50% or greater have been observed for MFG-S packaged vectors.
Ellem et al. (1997) Immunol. Immunother. 44(1):10-20; Dranoff et al. (1997) Hum. Gene
Ther. 1:111-2.

In applications for which transient expression is preferred, adenoviral-based
systems are useful. Adenoviral based vectors are capable of very high transduction
efficiency in many cell types and are capable of infecting, and hence delivering nucleic
acid to, both dividing and non-dividing cells. With such vectors, high titers and levels of
expression have been obtained. Adenovirus vectors can be produced in large quantities
in a relatively simple system.
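
As an informal aid to the reader, the distinction drawn in this section between integrating vectors (described as often giving long-term expression) and adenoviral vectors (described as useful where transient expression is preferred) can be summarized as a small lookup. The Python sketch below is illustrative only and forms no part of the disclosed methods; the table `VECTOR_INTEGRATES` and the helper `candidate_vectors` are hypothetical names, and the entries record only what this section states, not a complete characterization of any vector system.

```python
from typing import List

# True = described in this section as capable of integrating into the host
# genome; False = described as suited to transient expression.
# Hypothetical table for illustration; entries reflect the text only.
VECTOR_INTEGRATES = {
    "retroviral": True,
    "lentiviral": True,
    "adeno-associated viral": True,
    "adenoviral": False,
}


def candidate_vectors(long_term_expression: bool) -> List[str]:
    """List vector classes whose described behavior matches the stated need."""
    return sorted(
        name
        for name, integrates in VECTOR_INTEGRATES.items()
        if integrates == long_term_expression
    )
```

In practice the choice of vector also depends on factors this sketch ignores, such as target tissue, packaging capacity, and whether the target cells are dividing.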

Replication-deficient recombinant adenovirus (Ad) vectors can be produced at
high titer and they readily infect a number of different cell types. Most adenovirus
vectors are engineered such that a transgene replaces the Ad E1a, E1b, and/or E3 genes;
the replication-defective vector is propagated in human 293 cells that supply the required
E1 functions in trans. Ad vectors can transduce multiple types of tissues in vivo,
including non-dividing, differentiated cells such as those found in the liver, kidney and
muscle. Conventional Ad vectors have a large carrying capacity for inserted DNA. An
example of the use of an Ad vector in a clinical trial involved polynucleotide therapy for
antitumor immunization with intramuscular injection. Sterman et al. (1998) Hum. Gene
Ther. 7:1083-1089. Additional examples of the use of adenovirus vectors for gene
transfer in clinical trials include Rosenecker et al. (1996) Infection 24:5-10; Sterman et
al., supra; Welsh et al. (1995) Hum. Gene Ther. 2:205-218; Alvarez et al. (1997) Hum.
Gene Ther. 5:597-613; and Topf et al. (1998) Gene Ther. 5:507-513.

Packaging cells are used to form virus particles that are capable of infecting a host 
cell. Such cells include 293 cells, which package adenovirus, and ψ2 cells or PA317
cells, which package retroviruses. Viral vectors used in gene therapy are usually
generated by a producer cell line that packages a nucleic acid vector into a viral particle.
The vectors typically contain the minimal viral sequences required for packaging and
subsequent integration into a host, other viral sequences being replaced by an expression
cassette for the protein to be expressed. Missing viral functions are supplied in trans, if
necessary, by the packaging cell line. For example, AAV vectors used in gene therapy
typically only possess ITR sequences from the AAV genome, which are required for
packaging and integration into the host genome. Viral DNA is packaged in a cell line
that contains a helper plasmid encoding the other AAV genes, namely rep and cap, but
lacking ITR sequences. The cell line is also infected with adenovirus as a helper. The
helper virus promotes replication of the AAV vector and expression of AAV genes from
the helper plasmid. The helper plasmid is not packaged in significant amounts due to a 
lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat 
treatment, which preferentially inactivates adenoviruses. 

In many gene therapy applications, it is desirable that the gene therapy vector be 
delivered with a high degree of specificity to a particular tissue type. A viral vector can
be modified to have specificity for a given cell type by expressing a ligand as a fusion
protein with a viral coat protein on the outer surface of the virus. The ligand is chosen to
have affinity for a receptor known to be present on the cell type of interest. For example,
Han et al. (1995) Proc. Natl. Acad. Sci. USA 92:9747-9751 reported that Moloney
murine leukemia virus can be modified to express human heregulin fused to gp70, and
the recombinant virus infects certain human breast cancer cells expressing human
epidermal growth factor receptor. This principle can be extended to other pairs of virus
expressing a ligand fusion protein and target cell expressing a receptor. For example,
filamentous phage can be engineered to display antibody fragments (e.g., Fab or Fv)
having specific binding affinity for virtually any chosen cellular receptor. Although the
above description applies primarily to viral vectors, the same principles can be applied to 
non-viral vectors. Such vectors can be engineered to contain specific uptake sequences 
thought to favor uptake by specific target cells. 

Gene therapy vectors can be delivered in vivo by administration to an individual 
patient, typically by systemic administration (e.g., intravenous, intraperitoneal,
intramuscular, subdermal, or intracranial infusion) or topical application, as described
infra. Alternatively, vectors can be delivered to cells ex vivo, such as cells explanted
from an individual patient (e.g., lymphocytes, bone marrow aspirates, tissue biopsy) or
universal donor hematopoietic stem cells, followed by reimplantation of the cells into a 
patient, usually after selection for cells which have incorporated the vector. 

Ex vivo cell transfection for diagnostics, research, or for gene therapy (e.g., via re- 
infusion of the transfected cells into the host organism) is well known to those of skill in 
the art. In a preferred embodiment, cells are isolated from the subject organism, 
transfected with a nucleic acid (gene or cDNA), and re-infused back into the subject 
organism (e.g., patient). Various cell types suitable for ex vivo transfection are well 
known to those of skill in the art. See, e.g., Freshney et al., Culture of Animal Cells, A
Manual of Basic Technique, 3rd ed., 1994, and references cited therein, for a discussion 
of isolation and culture of cells from patients. 

In one embodiment, hematopoietic stem cells are used in ex vivo procedures for 
cell transfection and gene therapy. The advantage to using stem cells is that they can be 
differentiated into other cell types in vitro, or can be introduced into a mammal (such as 
the donor of the cells) where they will engraft in the bone marrow. Methods for 
differentiating CD34+ stem cells in vitro into clinically important immune cell types 
using cytokines such as GM-CSF, IFN-γ and TNF-α are known. Inaba et al. (1992) J.
Exp. Med. 176:1693-1702. 

Stem cells are isolated for transduction and differentiation using known methods. 
For example, stem cells are isolated from bone marrow cells by panning the bone marrow 
cells with antibodies which bind unwanted cells, such as CD4+ and CD8+ (T cells),
CD45+ (pan B cells), GR-1 (granulocytes), and Iad (differentiated antigen presenting
cells). See Inaba et al., supra.

Vectors (e.g., retroviruses, adenoviruses, liposomes, etc.) containing therapeutic 
nucleic acids can also be administered directly to the organism for transduction of cells in
vivo. Alternatively, naked DNA can be administered. Administration is by any of the 
routes normally used for introducing a molecule into ultimate contact with blood or tissue 
cells. Suitable methods of administering such nucleic acids are available and well known 
to those of skill in the art, and, although more than one route can be used to administer a 
particular composition, a particular route can often provide a more immediate and more
effective reaction than another route. 

Pharmaceutically acceptable carriers are determined in part by the particular 
composition being administered, as well as by the particular method used to administer 
the composition. Accordingly, there are a wide variety of suitable formulations of 
pharmaceutical compositions described herein. See, e.g., Remington's Pharmaceutical
Sciences, 17th ed., 1989. 

B. Delivery of Polypeptides 

In additional embodiments, fusion proteins are administered directly to target 
cells. In certain in vitro situations, the target cells are cultured in a medium containing a 
fusion protein comprising one or more functional domains fused to one or more of the 
modified ZFPs described herein. 

An important factor in the administration of polypeptide compounds is ensuring 
that the polypeptide has the ability to traverse the plasma membrane of a cell, or the 
membrane of an intra-cellular compartment such as the nucleus. Cellular membranes are 
composed of lipid-protein bilayers that are freely permeable to small, nonionic lipophilic 
compounds and are inherently impermeable to polar compounds, macromolecules, and 
therapeutic or diagnostic agents. However, proteins, lipids and other compounds, which 
have the ability to translocate polypeptides across a cell membrane, have been described. 

For example, "membrane translocation polypeptides" have amphiphilic or 
hydrophobic amino acid subsequences that have the ability to act as membrane- 
translocating carriers. In one embodiment, homeodomain proteins have the ability to 
translocate across cell membranes. The shortest internalizable peptide of a homeodomain 
protein, Antennapedia, was found to be the third helix of the protein, from amino acid 
position 43 to 58. Prochiantz (1996) Curr. Opin. Neurobiol. 6:629-634. Another 
subsequence, the h (hydrophobic) domain of signal peptides, was found to have similar
cell membrane translocation characteristics. Lin et al. (1995) J. Biol. Chem. 270:14255-
14258.

Examples of peptide sequences which can be linked to a non-canonical zinc finger 
polypeptide (or fusion containing the same) for facilitating its uptake into cells include, 
but are not limited to: an 11 amino acid peptide of the tat protein of HIV; a 20 residue
peptide sequence which corresponds to amino acids 84-103 of the p16 protein (see
Fahraeus et al. (1996) Curr. Biol. 6:84); the third helix of the 60-amino acid long
homeodomain of Antennapedia (Derossi et al. (1994) J. Biol. Chem. 269:10444); the h
region of a signal peptide, such as the Kaposi fibroblast growth factor (K-FGF) h region
(Lin et al., supra); and the VP22 translocation domain from HSV (Elliot et al. (1997)
Cell 88:223-233). Other suitable chemical moieties that provide enhanced cellular 
uptake can also be linked, either covalently or non-covalently, to the ZFPs. 

Toxin molecules also have the ability to transport polypeptides across cell 
membranes. Often, such molecules (called "binary toxins") are composed of at least two
parts: a translocation or binding domain and a separate toxin domain. Typically, the
translocation domain, which can optionally be a polypeptide, binds to a cellular receptor,
facilitating transport of the toxin into the cell. Several bacterial toxins, including
Clostridium perfringens iota toxin, diphtheria toxin (DT), Pseudomonas exotoxin A (PE),
pertussis toxin (PT), Bacillus anthracis toxin, and pertussis adenylate cyclase (CYA),
have been used to deliver peptides to the cell cytosol as internal or amino-terminal
fusions. Arora et al. (1993) J. Biol. Chem. 268:3334-3341; Perelle et al. (1993) Infect.
Immun. 61:5147-5156; Stenmark et al. (1991) J. Cell Biol. 113:1025-1032; Donnelly et
al. (1993) Proc. Natl. Acad. Sci. USA 90:3530-3534; Carbonetti et al. (1995) Abstr.
Annu. Meet. Am. Soc. Microbiol. 95:295; Sebo et al. (1995) Infect. Immun. 63:3851-
3857; Klimpel et al. (1992) Proc. Natl. Acad. Sci. USA 89:10277-10281; and Novak et
al. (1992) J. Biol. Chem. 267:17186-17193.

Such subsequences can be used to translocate polypeptides, including the 
polypeptides as disclosed herein, across a cell membrane. This is accomplished, for 
example, by derivatizing the fusion polypeptide with one of these translocation
sequences, or by forming an additional fusion of the translocation sequence with the
fusion polypeptide. Optionally, a linker can be used to link the fusion polypeptide and the 
translocation sequence. Any suitable linker can be used, e.g., a peptide linker. 

A suitable polypeptide can also be introduced into an animal cell, preferably a 
mammalian cell, via liposomes and liposome derivatives such as immunoliposomes. The
term "liposome" refers to vesicles comprised of one or more concentrically ordered lipid
bilayers, which encapsulate an aqueous phase. The aqueous phase typically contains the
compound to be delivered to the cell.

The liposome fuses with the plasma membrane, thereby releasing the compound
into the cytosol. Alternatively, the liposome is phagocytosed or taken up by the cell in a
transport vesicle. Once in the endosome or phagosome, the liposome is either degraded
or it fuses with the membrane of the transport vesicle and releases its contents. 

In current methods of drug delivery via liposomes, the liposome ultimately 
becomes permeable and releases the encapsulated compound at the target tissue or cell. 
For systemic or tissue specific delivery, this can be accomplished, for example, in a 
passive manner wherein the liposome bilayer is degraded over time through the action of
various agents in the body. Alternatively, active drug release involves using an agent to
induce a permeability change in the liposome vesicle. Liposome membranes can be
constructed so that they become destabilized when the environment becomes acidic near
the liposome membrane. See, e.g., Proc. Natl. Acad. Sci. USA 84:7851 (1987);
Biochemistry 28:908 (1989). When liposomes are endocytosed by a target cell, for
example, they become destabilized and release their contents. This destabilization is
termed fusogenesis. Dioleoylphosphatidylethanolamine (DOPE) is the basis of many
"fusogenic" systems.

For use with the methods and compositions disclosed herein, liposomes typically
comprise a fusion polypeptide as disclosed herein, a lipid component, e.g., a neutral
and/or cationic lipid, and optionally include a receptor-recognition molecule such as an
antibody that binds to a predetermined cell surface receptor or ligand (e.g., an antigen).

A variety of methods are available for preparing liposomes as described in, e.g.,
U.S. Patent Nos. 4,186,183; 4,217,344; 4,235,871; 4,261,975; 4,485,054; 4,501,728;
4,774,085; 4,837,028; 4,946,787; PCT Publication No. WO 91/17424; Szoka et al. (1980) Ann.
Rev. Biophys. Bioeng. 9:467; Deamer et al. (1976) Biochim. Biophys. Acta 443:629-634;
Fraley et al. (1979) Proc. Natl. Acad. Sci. USA 76:3348-3352; Hope et al. (1985)
Biochim. Biophys. Acta 812:55-65; Mayer et al. (1986) Biochim. Biophys. Acta 858:161-
168; Williams et al. (1988) Proc. Natl. Acad. Sci. USA 85:242-246; Liposomes, Ostro
(ed.), 1983, Chapter 1; Hope et al. (1986) Chem. Phys. Lip. 40:89; Gregoriadis,
Liposome Technology (1984) and Lasic, Liposomes: from Physics to Applications (1993).
Suitable methods include, for example, sonication, extrusion, high
pressure/homogenization, microfluidization, detergent dialysis, calcium-induced fusion
of small liposome vesicles and ether-fusion methods, all of which are well known in the
art.

In certain embodiments, it may be desirable to target a liposome using targeting 
moieties that are specific to a particular cell type, tissue, and the like. Targeting of 
liposomes using a variety of targeting moieties (e.g., ligands, receptors, and monoclonal 
antibodies) has been previously described. See, e.g., U.S. Patent Nos. 4,957,773 and 
4,603,044.

Examples of targeting moieties include monoclonal antibodies specific to antigens
associated with neoplasms, such as prostate cancer specific antigen and MAGE. Tumors can
also be diagnosed by detecting gene products resulting from the activation or over-expression
of oncogenes, such as ras or c-erbB2. In addition, many tumors express antigens normally
expressed by fetal tissue, such as the alphafetoprotein (AFP) and carcinoembryonic antigen
(CEA). Sites of viral infection can be diagnosed using various viral antigens such as hepatitis
B core and surface antigens (HBVc, HBVs), hepatitis C antigens, Epstein-Barr virus antigens,
human immunodeficiency type-1 virus (HIV-1) and papilloma virus antigens. Inflammation
can be detected using molecules specifically recognized by surface molecules which are
expressed at sites of inflammation such as integrins (e.g., VCAM-1), selectin receptors (e.g.,
ELAM-1) and the like.

Standard methods for coupling targeting agents to liposomes are used. These 
methods generally involve the incorporation into liposomes of lipid components, e.g., 
phosphatidylethanolamine, which can be activated for attachment of targeting agents, or 
incorporation of derivatized lipophilic compounds, such as lipid derivatized bleomycin.
Antibody targeted liposomes can be constructed using, for instance, liposomes which
incorporate protein A. See Renneisen et al. (1990) J. Biol. Chem. 265:16337-16342 and
Leonetti et al. (1990) Proc. Natl. Acad. Sci. USA 87:2448-2451.

Pharmaceutical compositions and administration

The modified zinc finger proteins and fusion molecules as disclosed herein, and 
expression vectors encoding these polypeptides, can be used in conjunction with various 
methods of gene therapy to facilitate the action of a therapeutic gene product. In such 
applications, the ZFP-containing compositions can be administered directly to a patient, 
e.g., to facilitate the modulation of gene expression and for therapeutic or prophylactic 
applications, for example, cancer (including tumors associated with Wilms' third tumor 
gene), ischemia, diabetic retinopathy, macular degeneration, rheumatoid arthritis, 
psoriasis, HIV infection, sickle cell anemia, Alzheimer's disease, muscular dystrophy, 
neurodegenerative diseases, vascular disease, cystic fibrosis, stroke, and the like. 
Examples of microorganisms whose inhibition can be facilitated through use of the 
methods and compositions disclosed herein include pathogenic bacteria, e.g., Chlamydia, 
Rickettsial bacteria, Mycobacteria, Staphylococci, Streptococci, Pneumococci, 
Meningococci and Gonococci, Klebsiella, Proteus, Serratia, Pseudomonas, Legionella,
Diphtheria, Salmonella, Bacilli (e.g., anthrax), Vibrio (e.g., cholera), Clostridium (e.g.,
tetanus, botulism), Yersinia (e.g., plague), Leptospirosis, and Borrelia (e.g., Lyme
disease bacteria); infectious fungus, e.g., Aspergillus, Candida species; protozoa such as
sporozoa (e.g., Plasmodia), rhizopods (e.g., Entamoeba) and flagellates (Trypanosoma,
Leishmania, Trichomonas, Giardia, etc.); viruses, e.g., hepatitis (A, B, or C), herpes
viruses (e.g., VZV, HSV-1, HHV-6, HSV-II, CMV, and EBV), HIV, Ebola, Marburg and
related hemorrhagic fever-causing viruses, adenoviruses, influenza viruses, flaviviruses,
echoviruses, rhinoviruses, coxsackie viruses, coronaviruses, respiratory syncytial viruses,
mumps viruses, rotaviruses, measles viruses, rubella viruses, parvoviruses, vaccinia 
viruses, HTLV viruses, retroviruses, lentiviruses, dengue viruses, papillomaviruses, 
polioviruses, rabies viruses, and arboviral encephalitis viruses, etc. 

Administration of therapeutically effective amounts of modified ZFPs described 
herein, fusion molecules including these ZFPs, or nucleic acids encoding these 
polypeptides, is by any of the routes normally used for introducing polypeptides or 
nucleic acids into ultimate contact with the tissue to be treated. The polypeptides or 
nucleic acids are administered in any suitable manner, preferably with pharmaceutically 
acceptable carriers. Suitable methods of administering such modulators are available and 
well known to those of skill in the art, and, although more than one route can be used to 
administer a particular composition, a particular route can often provide a more 
immediate and more effective reaction than another route. 

Pharmaceutically acceptable carriers are determined in part by the particular 
composition being administered, as well as by the particular method used to administer 
the composition. Accordingly, there are a wide variety of suitable formulations of 
pharmaceutical compositions. See, e.g., Remington's Pharmaceutical Sciences, 17th ed., 
1985. 

ZFPs and ZFP fusion polypeptides or nucleic acids, alone or in combination with 
other suitable components, can be made into aerosol formulations (i.e., they can be 
"nebulized") to be administered via inhalation. Aerosol formulations can be placed into 
pressurized acceptable propellants, such as dichlorodifluoromethane, propane, nitrogen, 
and the like. 

Formulations suitable for parenteral administration, such as, for example, by 
intravenous, intramuscular, intradermal, and subcutaneous routes, include aqueous and 
non-aqueous, isotonic sterile injection solutions, which can contain antioxidants, buffers, 
bacteriostats, and solutes that render the formulation isotonic with the blood of the 
intended recipient, and aqueous and non-aqueous sterile suspensions that can include 
suspending agents, solubilizers, thickening agents, stabilizers, and preservatives. 
Compositions can be administered, for example, by intravenous infusion, orally, 
topically, intraperitoneally, intravesically or intrathecally. The formulations of 
compounds can be presented in unit-dose or multi-dose sealed containers, such as 
ampoules and vials. Injection solutions and suspensions can be prepared from sterile 
powders, granules, and tablets of the kind known to those of skill in the art. 

Applications 

The compositions and methods disclosed herein can be used to facilitate a number 
of processes involving transcriptional regulation. These processes include, but are not 
limited to, transcription, replication, recombination, repair, integration, maintenance of 
telomeres, processes involved in chromosome stability and disjunction, and maintenance 
and propagation of chromatin structures. Accordingly, the methods and compositions 
disclosed herein can be used to affect any of these processes, as well as any other process 
that can be influenced by ZFPs or ZFP fusions. 

In preferred embodiments, one or more of the molecules described herein are used 
to achieve targeted activation or repression of gene expression, e.g., based upon the 
specificity of the modified ZFP. In another embodiment, one or more of the molecules 
described herein are used to achieve reactivation of a gene, for example a 
developmentally silenced gene; or to achieve sustained activation of a transgene. The 
modified ZFP can be targeted to a region outside of the coding region of the gene of 
interest and, in certain embodiments, is targeted to a region outside the regulatory 
region(s) of the gene. In these embodiments, additional molecules, exogenous and/or 
endogenous, can be used to facilitate repression or activation of gene expression. The 
additional molecules can also be fusion molecules, for example, fusions between a ZFP 
and a functional domain such as an activation or repression domain. See, for example, 
co-owned WO 00/41566. 

Accordingly, expression of any gene in any organism can be modulated using the 
methods and compositions disclosed herein, including therapeutically relevant genes, 
genes of infecting microorganisms, viral genes, and genes whose expression is modulated 
in the processes of drug discovery and/or target validation. Such genes include, but are 
not limited to, Wilms' third tumor gene (WT3), vascular endothelial growth factors 
(VEGFs), VEGF receptors (e.g., flt and flk), CCR-5, low density lipoprotein receptor 
(LDLR), estrogen receptor, HER-2/neu, BRCA-1, BRCA-2, phosphoenolpyruvate 
carboxykinase (PEPCK), CYP7, fibrinogen, apolipoprotein A (ApoA), apolipoprotein B 
(ApoB), renin, nuclear factor κB (NF-κB), inhibitor of NF-κB (I-κB), tumor necrosis 
factors (e.g., TNF-α, TNF-β), interleukin-1 (IL-1), FAS (CD95), FAS ligand (CD95L), 
atrial natriuretic factor, platelet-derived factor (PDF), amyloid precursor protein (APP), 
tyrosinase, tyrosine hydroxylase, β-aspartyl hydroxylase, alkaline phosphatase, calpains 
(e.g., CAPN10), neuronal pentraxin receptor, adriamycin response protein, apolipoprotein E 
(apoE), leptin, leptin receptor, UCP-1, IL-1, IL-1 receptor, IL-2, IL-3, IL-4, IL-5, IL-6, 
IL-12, IL-15, interleukin receptors, G-CSF, GM-CSF, colony stimulating factor, 
erythropoietin (EPO), platelet-derived growth factor (PDGF), PDGF receptor, fibroblast 
growth factor (FGF), FGF receptor, PAF, p16, p19, p53, Rb, p21, myc, myb, globin, 
dystrophin, utrophin, cystic fibrosis transmembrane conductance regulator (CFTR), GDNF, 
nerve growth factor (NGF), NGF receptor, epidermal growth factor (EGF), EGF receptor, 
transforming growth factors (e.g., TGF-α, TGF-β), fibroblast growth factor (FGF), 
interferons (e.g., IFN-α, IFN-β and IFN-γ), insulin-related growth factor-1 (IGF-1), 
angiostatin, ICAM-1, signal transducer and activator of transcription (STAT), androgen 
receptors, e-cadherin, cathepsins (e.g., cathepsin W), topoisomerase, telomerase, bcl, 
bcl-2, Bax, T Cell-specific tyrosine kinase (Lck), p38 mitogen-activated protein kinase, 
protein tyrosine phosphatase (hPTP), adenylate cyclase, guanylate cyclase, α7 neuronal 
nicotinic acetylcholine receptor, 5-hydroxytryptamine (serotonin)-2A receptor, 
transcription elongation factor-3 (TEF-3), phosphatidylcholine transferase, ftz, PTI-1, 
polygalacturonase, EPSP synthase, FAD2-1, Δ-9 desaturase, Δ-12 desaturase, Δ-15 
desaturase, acetyl-Coenzyme A carboxylase, acyl-ACP thioesterase, ADP-glucose 
pyrophosphorylase, starch synthase, cellulose synthase, sucrose synthase, fatty acid 
hydroperoxide lyase, and peroxisome proliferator-activated receptors, such as PPAR-γ2. 

Expression of human, mammalian, bacterial, fungal, protozoal, Archaeal, plant 
and viral genes can be modulated; viral genes include, but are not limited to, hepatitis 
virus genes such as, for example, HBV-C, HBV-S, HBV-X and HBV-P; and HIV genes 
such as, for example, tat and rev. Modulation of expression of genes encoding antigens 
of a pathogenic organism can be achieved using the disclosed methods and compositions. 

Additional genes include those encoding cytokines, lymphokines, interleukins, growth 
factors, mitogenic factors, apoptotic factors, cytochromes, chemotactic factors, chemokine 
receptors (e.g., CCR-2, CCR-3, CCR-5, CXCR-4), phospholipases (e.g., phospholipase C), 
nuclear receptors, retinoid receptors, organellar receptors, hormones, hormone receptors, 
oncogenes, tumor suppressors, cyclins, cell cycle checkpoint proteins (e.g., Chk1, Chk2), 
senescence-associated genes, immunoglobulins, genes encoding heavy metal chelators, 
protein tyrosine kinases, protein tyrosine phosphatases, tumor necrosis factor receptor- 
associated factors (e.g., Traf-3, Traf-6), apolipoproteins, thrombic factors, vasoactive factors, 
neuroreceptors, cell surface receptors, G-proteins, G-protein-coupled receptors (e.g., 
substance K receptor, angiotensin receptor, α- and β-adrenergic receptors, serotonin receptors, 
and PAF receptor), muscarinic receptors, acetylcholine receptors, GABA receptors, glutamate 
receptors, dopamine receptors, adhesion proteins (e.g., CAMs, selectins, integrins and 
immunoglobulin superfamily members), ion channels, receptor-associated factors, 
hematopoietic factors, transcription factors, and molecules involved in signal transduction. 
Expression of disease-related genes, and/or of one or more genes specific to a particular tissue 
or cell type such as, for example, brain, muscle, heart, nervous system, circulatory system, 
reproductive system, genitourinary system, digestive system and respiratory system can also 
be modulated. 

Other applications include therapeutic methods in which a modified ZFP, a ZFP fusion 
polypeptide, or a nucleic acid encoding a modified ZFP or a ZFP fusion is administered to a 
subject and used to modulate the expression of a target gene within the subject (as disclosed, 
for example, in co-owned PCT WO 00/41566). The modulation can be in the form of 
repression, for example, when the target gene resides in a pathological infecting 
microorganism, or in an endogenous gene of the patient, such as an oncogene or viral 
receptor, that is contributing to a disease state. Alternatively, the modulation can be in the 
form of activation, when activation of expression or increased expression of an endogenous 
cellular gene (such as, for example, a tumor suppressor gene) can ameliorate a disease state. 
Exemplary ZFP fusion polypeptides for both activation and repression of gene expression are 
disclosed supra. For such applications, modified ZFPs, ZFP fusion polypeptides or, more 
typically, nucleic acids encoding them are formulated with a pharmaceutically acceptable 
carrier as a pharmaceutical composition. 

Pharmaceutically acceptable carriers and excipients are determined in part by the 
particular composition being administered, as well as by the particular method used to 
administer the composition. See, for example, Remington's Pharmaceutical Sciences, 17th 
ed., 1985. ZFPs, ZFP fusion polypeptides, or polynucleotides encoding ZFP fusion 
polypeptides, alone or in combination with other suitable components, can be made into 
aerosol formulations (i.e., they can be "nebulized") to be administered via inhalation. Aerosol 
formulations can be placed into pressurized acceptable propellants, such as 
dichlorodifluoromethane, propane, nitrogen, and the like. Formulations suitable for parenteral 
administration, such as, for example, by intravenous, intramuscular, intradermal, and 
subcutaneous routes, include aqueous and non-aqueous, isotonic sterile injection solutions, 
which can contain antioxidants, buffers, bacteriostats, and solutes that render the formulation 
isotonic with the blood of the intended recipient, and aqueous and non-aqueous sterile 
suspensions that can include suspending agents, solubilizers, thickening agents, stabilizers, 
and preservatives. Compositions can be administered, for example, by intravenous infusion, 
orally, topically, intraperitoneally, intravesically or intrathecally. The formulations of 
compounds can be presented in unit-dose or multi-dose sealed containers, such as ampoules 
and vials. Injection solutions and suspensions can be prepared from sterile powders, granules, 
and tablets of the kind previously described. 

The dose administered to a patient should be sufficient to effect a beneficial 
therapeutic response in the patient over time. The dose is determined by the efficacy and 
binding affinity (Kd) of the particular ZFP employed, the target cell, and the condition of the 
patient, as well as the body weight or surface area of the patient to be treated. The size of the 
dose also is determined by the existence, nature, and extent of any adverse side effects that 
accompany the administration of a particular compound or vector in a particular patient. 

In other applications, modified ZFPs and other DNA- and/or RNA-binding proteins 
are used in diagnostic methods for sequence-specific detection of target nucleic acid in a 
sample. For example, modified ZFPs can be used to detect variant alleles associated with a 
disease or phenotype in patient samples. As an example, modified ZFPs can be used to detect 
the presence of particular mRNA species or cDNA in a complex mixture of mRNAs or 
cDNAs. As a further example, modified ZFPs can be used to quantify the copy number of a 
gene in a sample. For example, detection of loss of one copy of a p53 gene in a clinical 
sample is an indicator of susceptibility to cancer. In a further example, modified ZFPs are 
used to detect the presence of pathological microorganisms in clinical samples. This is 
achieved by using one or more modified ZFPs, as disclosed herein, that bind a target sequence 
in one or more genes within the microorganism to be detected. A suitable format for 
performing diagnostic assays employs modified ZFPs linked to a domain that allows 
immobilization of the ZFP on a solid support such as, for example, a microtiter plate or an 
ELISA plate. The immobilized ZFP is contacted with a sample suspected of containing a 
target nucleic acid under conditions in which binding between the modified ZFP and its target 
sequence can occur. Typically, nucleic acids in the sample are labeled (e.g., in the course of 
PCR amplification). Alternatively, unlabelled nucleic acids can be detected using a second 
labeled probe nucleic acid. After washing, bound, labeled nucleic acids are detected. 
Labeling can be direct (i.e., the probe binds directly to the target nucleic acid) or indirect (i.e., 
probe binds to one or more molecules which themselves bind to the target). Labels can be, for 
example, radioactive, fluorescent, chemiluminescent and/or enzymatic. 

Modified ZFPs, as disclosed herein, can also be used in assays that link phenotype to 
the expression of particular genes. Current methodologies for determination of gene function 
rely primarily upon either over-expressing a gene of interest or removing a gene of interest 
from its natural biological setting, and observing the effects. The phenotypic effects resulting 
from over-expression or knockout are then interpreted as an indication of the role of the gene 
in the biological system. An exemplary animal model system for performing these types of 
analysis is the mouse. A transgenic mouse generally contains an introduced gene or has been 
genetically modified so as to up-regulate an endogenous gene. Alternatively, in a "knock-out" 
mouse, an endogenous gene has been deleted or its expression has been ablated. There are 
several problems with these existing systems, many of which are related to the fact that it is 
only possible to achieve "all-or-none" modulation of gene expression in these systems. The 
first is the limited ability to modulate expression of the gene under study (e.g., in knock-out 
mice, the gene under study is generally either absent from the genome or totally 
non-functional; while in transgenic mice which overexpress a particular gene, there is generally a 
single level of overexpression). The second is the oft-encountered requirement for certain 
genes at multiple stages of development. Thus, it is not possible to determine the adult 
function of a particular gene, whose activity is also required during embryonic development, 
by generating a knock-out of that gene, since the animals containing the knock-out will not 
survive to adulthood. 

One advantage of using ZFP-mediated regulation of a gene to determine its function, 
relative to the aforementioned conventional knockout analysis, is that expression of a ZFP can 
be placed under small molecule control. See, for example, U.S. Patent Nos. 5,654,168; 
5,789,156; 5,814,618; 5,888,981; 6,004,941; 6,087,166; 6,136,954; and co-owned WO 
00/41566. By controlling expression levels of the ZFPs, one can in turn control the 
expression levels of a gene regulated by the ZFP to determine what degree of repression or 
stimulation of expression is required to achieve a given phenotypic or biochemical effect. 
This approach has particular value for drug development. In addition, placing ZFP expression 
under small molecule control allows one to surmount the aforementioned problems of 
embryonic lethality and developmental compensation, by switching on expression of the ZFP 
at a later stage in development and observing the effects in the adult animal. 

Transgenic mice having target genes regulated by a modified ZFP or a ZFP fusion 
protein can be produced by integration of the nucleic acid encoding the modified ZFP or ZFP 
fusion at any site in trans to the target gene. Accordingly, homologous recombination is not 
required for integration of the ZFP-encoding nucleic acid. Further, because the transcriptional 
regulatory activity of a modified ZFP or ZFP fusion is trans-dominant, one is only required to 
obtain animals having one chromosomal copy of a ZFP-encoding nucleic acid. Therefore, 
functional knock-out animals can be produced without backcrossing. 

All references cited herein are hereby incorporated by reference in their entirety for all 
purposes. 

The following examples are presented as illustrative of, but not limiting, the 
claimed subject matter. 

EXAMPLES 

Example 1. Production of non-canonical zinc finger binding proteins 

Synthetic genes encoding non-canonical zinc finger binding proteins are obtained 
following the procedure outlined in co-owned PCT WO 00/42219, with the exception that the 
oligonucleotide encoding the recognition helix to be modified includes a polynucleotide 
sequence that specifies the modified amino acid sequence. For example, for modification of 
finger 3 (the C-terminal-most finger of a three-finger ZFP), the sequence of oligonucleotide 6 
is designed to encode the modified zinc coordination residue(s). 

Example 2. Modulation of expression of the LCK gene with Non-Canonical ZFP 

In this experiment, the designed zinc finger protein "PTP2", which recognizes the 
target sequence GAGGGGGCG and regulates expression of the LCK gene, was modified via 
substitution of the second histidine in its third finger with cysteine (to yield the protein 
"PTP2(H->C)"). Two flanking residues were also changed to glycine to enhance the potential 
of the introduced cysteine to productively coordinate zinc. The sequences of the resultant zinc 
finger proteins were as follows: 

PTP2: 
F1 PGKKKQHICHIQGCGKVYGRSDELTRHLRWHTGER (SEQ ID NO: 112) 
F2 PFMCTWSYCGKRFTRSDHLTRHKRTHTGEK (SEQ ID NO: 113) 
F3 KFACPE--CPKRFMRSDNLTRHIKTHQNKKGGS (SEQ ID NO: 114) 

PTP2(H->C): 
F1 PGKKKQHICHIQGCGKVYGRSDELTRHLRWHTGER (SEQ ID NO: 115) 
F2 PFMCTWSYCGKRFTRSDHLTRHKRTHTGEK (SEQ ID NO: 116) 
F3 KFACPE--CPKRFMRSDNLTRHIGGCQNKKGGS (SEQ ID NO: 117) 

Bold and underlines highlight zinc-coordinating residues, and italics highlights 
positions changed in converting PTP2 into PTP2 (H -> C). 
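
Because the italic highlighting is not reproduced in this text, the changed positions can be recovered by comparing the two third-finger sequences residue by residue. The following is a minimal sketch; the function name `substitutions` is illustrative only, and the alignment gaps are omitted from the sequences.

```python
# Third-finger (F3) sequences as listed above, with alignment gaps omitted.
PTP2_F3 = "KFACPECPKRFMRSDNLTRHIKTHQNKKGGS"
PTP2_HC_F3 = "KFACPECPKRFMRSDNLTRHIGGCQNKKGGS"


def substitutions(parent, variant):
    """Return (1-based position, parent residue, variant residue) per mismatch."""
    if len(parent) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(parent, variant), start=1)
        if a != b
    ]


changes = substitutions(PTP2_F3, PTP2_HC_F3)
for pos, old, new in changes:
    print(f"{old}{pos}{new}")
```

The comparison reports three adjacent changes: the two flanking residues converted to glycine and the histidine converted to cysteine.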



Both ZFPs were expressed in 293 cells as fusions with a nuclear localization signal 
(NLS), VP16 activation domain, and a FLAG tag. The structure (e.g., order) of the fusion 
proteins was as follows: 

NLS - ZFP - VP16 - FLAG 



After expression of each protein in 293 cells, cellular levels of the LCK mRNA were 
determined relative to the level of a control RNA (18S RNA) using a PCR based "Taqman" 
assay. RNA levels were also determined for a control protein (NVF) lacking any ZFP (and 
containing only the NLS, VP16 and FLAG regions). Each experiment was performed in 
duplicate, and the measured RNA ratios are shown in Figure 1. These ratios indicate that the 
PTP2 ZFP activates expression of the LCK gene, and that the PTP2(H->C) ZFP activates LCK 
to even higher levels. These results illustrate the potential of substitutions at zinc- 
coordinating positions to provide ZFPs with enhanced cellular function. As illustrated in 
Figure 1, modification of zinc-coordinating positions can enhance the cellular activity of 
designed zinc finger protein transcription factors. 
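
The normalization described above (LCK signal relative to the 18S control, in duplicate, compared with the NVF control) can be sketched as follows. This is an illustration only: it assumes the assay yields relative quantities for each RNA, the function names are hypothetical, and the numbers are placeholders rather than values from Figure 1.

```python
from statistics import mean


def normalized_level(target_quantity, control_quantity):
    """Ratio of target RNA (e.g., LCK) to control RNA (e.g., 18S) for one sample."""
    if control_quantity <= 0:
        raise ValueError("control quantity must be positive")
    return target_quantity / control_quantity


def fold_over_reference(sample_ratios, reference_ratios):
    """Mean normalized level of a sample divided by that of a reference (e.g., NVF)."""
    return mean(sample_ratios) / mean(reference_ratios)


# HYPOTHETICAL placeholder duplicates; these are not measurements from this document.
nvf = [normalized_level(1.0, 10.0), normalized_level(1.2, 10.0)]
zfp = [normalized_level(4.0, 10.0), normalized_level(4.8, 10.0)]

fold = fold_over_reference(zfp, nvf)
print(f"fold activation over NVF: {fold:.2f}")
```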



Example 3. Modulation of expression of a human VEGF gene with modified ZFPs 

This example describes the modification of two VEGF-regulating ZFPs. For each of 
the two ZFPs, a number of non-canonical modified ZFPs were constructed. The proteins were 
then tested for their ability to regulate VEGF expression and compared with the two C2H2 
parental proteins. 
Zinc finger proteins comprising a series of C2H2 zinc fingers, and designed to bind to 
the human VEGF-A gene and regulate its expression, have been described. Liu et al. (2001) 
J. Biol. Chem. 276:11,323-11,334. Two of these ZFPs (named VOP30A and VOP32B), each 
containing three zinc fingers, were converted to non-canonical ZFPs. VOP30A corresponds 
to VZ+42/+530 and VOP32B corresponds to VZ+434a in the Liu et al. reference. This was 
accomplished by modifying the third finger of each protein. Seven non-canonical versions of 
each protein were made, each comprising a different non-canonical C2HC third finger. 
Amino acid sequences of portions of the canonical parent ZFPs and each of the non-canonical 
ZFPs, beginning at histidine +7 (with respect to the start of the alpha-helix) of the third finger, 
are shown in Table 1. 

Table 1 

NAME    SEQUENCE        SEQ ID NO. 
C2H2    HIKTHQNKKGGS    11 
S       HSETGCTKKGGS    12 
E       HLKSLTPCTGGS    13 
K       HKCGIQNKKGGS    14 
CT      HSENCQGKKGGS    15 
C       HIKTCQNKKGGS    16 
GC      HIKGCQNKKGGS    17 
GGC     HIGGCQNKKGGS    18 

Notes: 
1. sequences begin at +7 of the alpha helix of the third zinc finger 
2. residues involved in metal coordination are bolded and underlined 
3. the first row (protein designated C2H2) shows the sequence of the parental ZFPs 
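
One way to compare the Table 1 variants is by the number of residues separating the +7 histidine from the next potential zinc ligand. The sketch below assumes, since the bold and underline marking is not reproduced here, that the fourth coordinating residue is the first histidine or cysteine downstream of the +7 histidine; the function name is illustrative.

```python
# Third-finger sequences from Table 1, each starting at the +7 histidine.
TABLE_1 = {
    "C2H2": "HIKTHQNKKGGS",
    "S": "HSETGCTKKGGS",
    "E": "HLKSLTPCTGGS",
    "K": "HKCGIQNKKGGS",
    "CT": "HSENCQGKKGGS",
    "C": "HIKTCQNKKGGS",
    "GC": "HIKGCQNKKGGS",
    "GGC": "HIGGCQNKKGGS",
}


def spacer_length(seq):
    """Count residues between the +7 His and the next His or Cys.

    ASSUMPTION: that next His/Cys is the fourth zinc-coordinating residue.
    """
    if not seq.startswith("H"):
        raise ValueError("sequence must begin at the +7 histidine")
    for offset, residue in enumerate(seq[1:]):
        if residue in "HC":
            return offset
    raise ValueError("no downstream His or Cys found")


spacers = {name: spacer_length(seq) for name, seq in TABLE_1.items()}
for name, n in spacers.items():
    print(f"{name:5s} H-X{n}-{TABLE_1[name][n + 1]}")
```

Under that assumption, the C, GC, GGC and CT variants keep the three-residue spacing of the parent, while the K, S and E variants shorten or lengthen it.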



Human embryonic kidney cells (HEK 293) were transfected with nucleic acids 
encoding non-canonical derivatives of the VOP30A and VOP32B fusion proteins, as well as 
the parent (canonical) fusion proteins. The fusion proteins also comprised a VP16 
transcriptional activation domain, a nuclear localization sequence and an epitope tag. 

The cells were grown in DMEM (Dulbecco's modified Eagle's medium), 
supplemented with 10% fetal bovine serum, in a 5% CO2 incubator at 37°C. Cells were 
plated in 24-well plates at a density of 160,000 cells per well. A day later, when the cells 
were at approximately 70% confluence, plasmids encoding ZFP-VP16 fusions were 
introduced into the cells using LipofectAMINE 2000™ reagent (Gibco Life Technologies, 
Rockville, MD) according to the manufacturer's recommendations, using 2 µl 
LipofectAMINE 2000™ and 1 µg plasmid DNA per well. Medium was removed and 
replaced with fresh medium 16 hours after transfection. Forty hours after transfection, the 
culture medium was harvested and assayed for VEGF-A expression. VEGF-A protein content 
in the culture medium was assayed using a human VEGF ELISA kit (Quanti-Glo, R&D 
Systems, Minneapolis, MN) according to the manufacturer's instructions. 

The results, shown in Figure 2, indicate that C2HC derivatives of both VOP30A and 
VOP32B activate VEGF expression and are thus useful as targeted exogenous regulatory 
molecules. 

Example 4. Production of modified plant zinc finger binding proteins 

This example describes a strategy to select amino acid sequences for plant zinc finger 
backbones from among existing plant zinc finger sequences, and subsequent conceptual 
modification of the selected plant zinc finger amino acid sequences to optimize their DNA 
binding ability. Oligonucleotides used in the preparation of polynucleotides encoding 
proteins containing these zinc fingers in tandem array are then described. 

A. Selection of plant zinc finger backbones 

A search was conducted for plant zinc fingers whose backbone sequences (i.e., the 
portion of the zinc finger outside of the -1 through +6 portion of the recognition helix) 
resembled that of the SP-1 consensus sequence described by Berg (1992) Proc. Natl. Acad. 
Sci. USA 89:11,109-11,110. The sequences selected included the two conserved cysteine 
residues, a conserved basic residue (lysine or arginine) located two residues to the C-terminal 
side of the second (i.e. C-terminal) cysteine, a conserved phenylalanine residue located two 
residues to the C-terminal side of the basic residue, the two conserved histidine residues, and a 
conserved arginine residue located two residues to the C-terminal side of the first (i.e., N- 
terminal) conserved histidine. The amino acid sequences of these selected plant zinc finger 
backbones (compared to the SP-1 consensus sequence) are shown below, with conserved 
residues shown in bold and X referring to residues located at positions -1 through +6 in the 
recognition helix (which will differ among different proteins depending upon the target 
sequence): 

SP-1 consensus:       YKCPECGKSFSXXXXXXXHQRTHTGEKP (SEQ ID NO: 19) 
F1:             KKKSKGHECPICFRVFKXXXXXXXHKRSHTGEKP (SEQ ID NO: 20) 
F2:                   YKCTVCGKSFSXXXXXXXHKRLHTGEKP (SEQ ID NO: 21) 
F3:                   FSCNYCQRKFYXXXXXXXHVRIH      (SEQ ID NO: 22) 
                            -5  -1    5 

The first finger (F1) was chosen because it contained a basic sequence N-terminal to the 
finger that is also found adjacent to the first finger of SP-1. The finger denoted F1 is a Petunia 
sequence; the F2 and F3 fingers are Arabidopsis sequences. 
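
The selection criteria listed above can be expressed as a single pattern and checked against the four backbones. This is a sketch rather than the search actually performed: the regular expression is assembled from the stated spacing rules, and allowing two to four residues between the cysteines is an assumption (the sequences shown all have two).

```python
import re

# C..C, basic residue two positions after the second Cys, Phe two positions
# after the basic residue, one residue (position -2), seven helix positions
# (-1 to +6), then H-x-R-x-H. The {2,4} Cys spacing is an ASSUMPTION.
BACKBONE = re.compile(r"C.{2,4}C.[KR].F.(.{7})H.R.H")

BACKBONES = {
    "SP-1 consensus": "YKCPECGKSFSXXXXXXXHQRTHTGEKP",
    "F1": "KKKSKGHECPICFRVFKXXXXXXXHKRSHTGEKP",
    "F2": "YKCTVCGKSFSXXXXXXXHKRLHTGEKP",
    "F3": "FSCNYCQRKFYXXXXXXXHVRIH",
}


def helix_region(seq):
    """Return the seven residues at positions -1 to +6, or None if no match."""
    match = BACKBONE.search(seq)
    return match.group(1) if match else None


results = {name: helix_region(seq) for name, seq in BACKBONES.items()}
for name, helix in results.items():
    print(f"{name:15s} helix positions: {helix}")
```

Each backbone matches, and the captured group is the run of X placeholders that is later replaced by a target-specific recognition helix.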

B. Modification of plant zinc finger backbones 

Two of the three plant zinc fingers (F1 and F3, above) were modified so that their amino 
acid sequences more closely resembled the sequence of SP-1, as follows. (Note that the 
sequence of SP-1 is different from the sequence denoted "SP-1 consensus.") In F3, the Y residue 
at position -2 was converted to a G, and the sequence QNKK (SEQ ID NO:23) was added to the 
C-terminus of F3. The QNKK (SEQ ID NO:23) sequence is present C-terminal to the third 
finger of SP-1, and permits greater flexibility of that finger, compared to fingers 1 and 2, which 
are flanked by the helix-capping sequence T G E K/R K/P (SEQ ID NO:24). Such flexibility can 
be particularly beneficial when the third finger is modified to contain a non-C2H2 backbone, as 
described herein. Finally, several amino acids were removed from the N-terminus of F1. The 
resulting zinc finger backbones had the following sequences: 

KSKGHECPICFRVFKXXXXXXXHKRSHTGEKP (SEQ ID NO: 25) 
YKCTVCGKSFSXXXXXXXHKRLHTGEKP (SEQ ID NO: 26) 
FSCNYCQRKFGXXXXXXXHVRIHQNKK (SEQ ID NO: 27) 

Amino acid residues denoted by X, present in the recognition portion of these zinc 
fingers, are designed or selected depending upon the desired target site, according to methods 
disclosed, for example, in co-owned WO 00/41566 and WO 00/42219, and/or references cited 
supra. 
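
The backbone modifications described in this section can be reproduced as simple string operations and checked against SEQ ID NOs: 25 and 27. In this sketch the helper name is illustrative, and the number of residues trimmed from F1 (two) is inferred by comparing SEQ ID NO: 20 with SEQ ID NO: 25 rather than stated in the text.

```python
F1_SELECTED = "KKKSKGHECPICFRVFKXXXXXXXHKRSHTGEKP"  # SEQ ID NO: 20
F3_SELECTED = "FSCNYCQRKFYXXXXXXXHVRIH"             # SEQ ID NO: 22


def set_position_minus2(seq, residue):
    """Replace the residue at helix position -2.

    The first X is position -1, so position -2 is the residue just before it.
    """
    i = seq.index("X") - 1
    return seq[:i] + residue + seq[i + 1:]


# ASSUMPTION: "several amino acids" removed from F1 = the two leading lysines.
f1_modified = F1_SELECTED[2:]

# F3: Y at position -2 becomes G, then QNKK is appended to the C-terminus.
f3_modified = set_position_minus2(F3_SELECTED, "G") + "QNKK"

print(f1_modified)
print(f3_modified)
```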
C. Nucleic acid sequences encoding backbones for modified plant ZFPs 

The following polynucleotide sequences were used for design of three-finger plant ZFPs 
that contain the F1, F2 and F3 backbones described above. Polynucleotides encoding 
multi-finger ZFPs were designed according to an overlapping oligonucleotide method as described in, 
for example, co-owned WO 00/41566 and WO 00/42219. Oligonucleotides H1, H2 and H3 
(below) comprise sequences corresponding to the reverse complement of the recognition helices 
of fingers 1-3 respectively; accordingly, nucleotides denoted by N vary depending upon the 
desired amino acid sequences of the recognition helices, which, in turn, depend upon the 
nucleotide sequence of the target site. Oligonucleotides PB1, PB2 and PB3 encode the 
beta-sheet portions of the zinc fingers, which are common to all constructs. Codons used frequently 
in Arabidopsis and E. coli were selected for use in these oligonucleotides. 


H1: 
5'-CTC ACC GGT GTG AGA ACG CTT GTG NNN NNN NNN NNN NNN NNN NNN CTT 
GAA AAC ACG GAA-3' 
(SEQ ID NO:28) 

H2: 
5'-TTC ACC AGT ATG AAG ACG CTT ATG NNN NNN NNN NNN NNN NNN NNN AGA 
AAA AGA CTT ACC-3' 
(SEQ ID NO:29) 

H3: 
5'-CTT CTT GTT CTG GTG GAT ACG CAC GTG NNN NNN NNN NNN NNN NNN NNN 
ACC GAA CTT ACG CTG-3' 
(SEQ ID NO:30) 

PB1: 
5'-AAGTCTAAGGGTCACGAGTGCCCAATCTGCTTCCGTGTTTTCAAG-3' 
(SEQ ID NO:31) 

PB2: 
5'-TCTCACACCGGTGAGAAGCCATACAAGTGCACTGTTTGTGGTAAGTCTTTTTCT-3' 
(SEQ ID NO:32) 

PB3: 
5'-CTTCATACTGGTGAAAAGCCATTCTCTTGCAACTACTGCCAGCGTAAGTTCGGT-3' 
(SEQ ID NO:33) 

Briefly, these six oligonucleotides are annealed and amplified by polymerase chain 
reaction. The initial amplification product is reamplified using primers that are complementary 
to the initial amplification product and that also contain 5' extensions containing restriction 
enzyme recognition sites, to facilitate cloning. The second amplification product is inserted into 
a vector containing, for example, one or more functional domains, nuclear localization 
sequences, and/or epitope tags. See, for example, co-owned WO 00/41566 and WO 00/42219. 
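
The way the six oligonucleotides tile across the coding sequence can be checked by computing the overlap between neighbors on the sense strand. This is a minimal sketch under stated assumptions: the sense-strand order PB1, H1, PB2, H2, PB3, H3 (with each H oligonucleotide reverse-complemented) is inferred from the finger layout, N is kept as a placeholder, and the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement; N placeholders are preserved."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:n]):
            return n
    return 0


N21 = "N" * 21  # seven helix codons
H1 = "CTCACCGGTGTGAGAACGCTTGTG" + N21 + "CTTGAAAACACGGAA"
H2 = "TTCACCAGTATGAAGACGCTTATG" + N21 + "AGAAAAAGACTTACC"
H3 = "CTTCTTGTTCTGGTGGATACGCACGTG" + N21 + "ACCGAACTTACGCTG"
PB1 = "AAGTCTAAGGGTCACGAGTGCCCAATCTGCTTCCGTGTTTTCAAG"
PB2 = "TCTCACACCGGTGAGAAGCCATACAAGTGCACTGTTTGTGGTAAGTCTTTTTCT"
PB3 = "CTTCATACTGGTGAAAAGCCATTCTCTTGCAACTACTGCCAGCGTAAGTTCGGT"

# ASSUMED sense-strand order of the six oligonucleotides.
sense = [PB1, revcomp(H1), PB2, revcomp(H2), PB3, revcomp(H3)]
overlaps = [overlap(a, b) for a, b in zip(sense, sense[1:])]

# Merge neighbors through their overlaps to get the full-length template.
template = sense[0]
for n, nxt in zip(overlaps, sense[1:]):
    template += nxt[n:]

print(overlaps, len(template))
```

Each junction shares 15 nucleotides, and the merged template is a whole number of codons, consistent with the three backbones of SEQ ID NOs: 25-27 arranged in tandem.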

Example 5. Construction of a polynucleotide encoding a modified plant zinc finger 
protein for binding to a predetermined target sequence 

A modified plant zinc finger protein was designed to recognize the target sequence 
5'-GAGGGGGCG-3'. Recognition helix sequences for F1, F2 and F3 were determined, as 
shown in Table 2, and oligonucleotides corresponding to H1, H2 and H3 above, also including 
sequences encoding these recognition helices, were used for PCR assembly as described above. 



Table 2 

Finger  Target  Helix sequence             Nucleotide sequence for PCR assembly 
F1      GCG     RSDELTR (SEQ ID NO:109)    5'-CTCACCGGTGTGAGAACGCTTGTGACGGGTCAACTCGTCAGAACGCTTGAAAACACGGAA-3' (SEQ ID NO:34) 
F2      GGG     RSDHLTR (SEQ ID NO:110)    5'-TTCACCAGTATGAAGACGCTTATGACGGGTCAAGTGGTCAGAACGAGAAAAAGACTTACC-3' (SEQ ID NO:35) 
F3      GAG     RSDNLTR (SEQ ID NO:111)    5'-CTTCTTGTTCTGGTGGATACGCACGTGACGGGTCAAGTTGTCAGAACGACCGAACTTACGCTG-3' (SEQ ID NO:36) 
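
Because the H oligonucleotides are written as reverse complements, a quick consistency check is to reverse-complement each Table 2 sequence and translate it with the standard genetic code. The sketch below does this; the helper names are illustrative, and reading frame 1 of the reverse complement is assumed to be the coding frame.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons in frame 1."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# finger: (helix from Table 2, oligonucleotide from Table 2)
TABLE_2 = {
    "F1": ("RSDELTR", "CTCACCGGTGTGAGAACGCTTGTGACGGGTCAACTCGTCAGAACGCTTGAAAACACGGAA"),
    "F2": ("RSDHLTR", "TTCACCAGTATGAAGACGCTTATGACGGGTCAAGTGGTCAGAACGAGAAAAAGACTTACC"),
    "F3": ("RSDNLTR", "CTTCTTGTTCTGGTGGATACGCACGTGACGGGTCAAGTTGTCAGAACGACCGAACTTACGCTG"),
}

encoded = {f: translate(revcomp(oligo)) for f, (_, oligo) in TABLE_2.items()}
for finger, (helix, _) in TABLE_2.items():
    print(finger, encoded[finger], helix in encoded[finger])
```

Each translated fragment contains the listed helix flanked by the backbone residues on either side, which confirms that the table entries are internally consistent.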



Subsequent to the initial amplification, a secondary amplification was conducted, as 
described above, using the following primers: 

PZF: 5'-CGGGGTACCAGGTAAGTCTAAGGGTCAC (SEQ ID NO:37) 

PZR: 5'-GCGCGGATCCACCCTTCTTGTTCTGGTGGATACG (SEQ ID NO:38). 

PZF includes a KpnI site (GGTACC, underlined) and overlaps the PB1 sequence (AAGTCTAAGGGTCAC, 
overlap indicated in bold). PZR includes a BamHI site (GGATCC, underlined) and overlaps with 
H3 (CTTCTTGTTCTGGTGGATACG, indicated in bold). 
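
The two features of each primer, the restriction site in the 5' extension and the 3' overlap with the assembly product, can be located programmatically. The following sketch uses the well-known KpnI (GGTACC) and BamHI (GGATCC) recognition sequences; the helper name and the truncated template constants are illustrative.

```python
KPNI = "GGTACC"
BAMHI = "GGATCC"

PZF = "CGGGGTACCAGGTAAGTCTAAGGGTCAC"
PZR = "GCGCGGATCCACCCTTCTTGTTCTGGTGGATACG"

PB1_START = "AAGTCTAAGGGTCACGAGTGC"       # 5' end of PB1
H3_START = "CTTCTTGTTCTGGTGGATACGCACGTG"  # 5' end of H3, before the helix codons


def primer_overlap(primer, template_start):
    """Longest 3' stretch of the primer matching the 5' end of the template."""
    for n in range(min(len(primer), len(template_start)), 0, -1):
        if primer.endswith(template_start[:n]):
            return n
    return 0


pzf_overlap = primer_overlap(PZF, PB1_START)
pzr_overlap = primer_overlap(PZR, H3_START)

print("PZF: KpnI at index", PZF.find(KPNI), "overlap", pzf_overlap, "nt")
print("PZR: BamHI at index", PZR.find(BAMHI), "overlap", pzr_overlap, "nt")
```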

The secondary amplification product is digested with KpnI and BamHI and inserted into 
an appropriate vector (e.g., YCF3, whose construction is described below) to construct an 
expression vector encoding a modified plant ZFP fused to a functional domain, for modulation of 
gene expression in plant cells. 
Example 6. Construction of Vectors for Expression of Modified Plant ZFPs 

YCF3 was generated as shown in Figure 3. The starting construct was a plasmid 
containing a CMV promoter, a SV40 nuclear localization sequence (NLS), a ZFP DNA 
binding domain, a Herpesvirus VP16 transcriptional activation domain and a FLAG epitope 
tag (pSB5186-NVF). This construct was digested with SpeI to remove the CMV promoter. 
The larger fragment was gel-purified and self-ligated to make a plasmid termed GF1. GF1 
was then digested with KpnI and HindIII, releasing sequences encoding the ZFP domain, the 
VP16 activation domain, and the FLAG epitope tag, and the larger fragment was ligated to a 
KpnI/HindIII fragment containing sequences encoding a ZFP binding domain and a VP16 
activation domain, to yield a plasmid named GF2. This resulted in deletion of sequences 
encoding the FLAG tag from the construct. 

GF2 was digested with BamHI and HindIII, releasing a small fragment encoding the
VP16 activation domain, and the larger fragment was purified and ligated to a
BamHI/HindIII-digested PCR fragment containing the maize C1 activation domain (Goff et al.
(1990) EMBO J. 9:2517-2522) (KpnI and HindIII sites were introduced into the PCR fragment
through KpnI and HindIII site-containing primers) to generate NCF1. A PCR fragment
containing a maize Opaque-2 NLS was digested with SpeI/KpnI and ligated to the larger
fragment from KpnI/SpeI-digested NCF1 to produce YCF2. YCF2 was then digested with MluI
and SpeI and the larger fragment was ligated to an MluI- and SpeI-digested PCR fragment
containing the plant-derived CaMV 35S promoter (MluI and SpeI sites were introduced into
the PCR fragment through MluI or SpeI site-containing primers) to generate the YCF3 vector.
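The succession of intermediate plasmids can be summarized by tracking which element occupies each position of the expression cassette after every cloning step. The sketch below is only a bookkeeping aid: the `Cassette` class and its field names are hypothetical, and the element labels are shorthand for the components named above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Cassette:
    """Hypothetical summary of the elements present in a construct."""
    promoter: Optional[str]
    nls: str
    binding_domain: str
    activation_domain: str
    tag: Optional[str]


# Starting construct pSB5186-NVF.
start = Cassette("CMV", "SV40 NLS", "ZFP", "VP16", "FLAG")

# SpeI digestion and self-ligation remove the CMV promoter.
gf1 = replace(start, promoter=None)
# KpnI/HindIII fragment exchange removes the FLAG tag.
gf2 = replace(gf1, tag=None)
# BamHI/HindIII exchange swaps VP16 for the maize C1 activation domain.
ncf1 = replace(gf2, activation_domain="maize C1")
# SpeI/KpnI exchange swaps the SV40 NLS for the maize Opaque-2 NLS.
ycf2 = replace(ncf1, nls="maize Opaque-2 NLS")
# MluI/SpeI insertion adds the CaMV 35S promoter.
ycf3 = replace(ycf2, promoter="CaMV 35S")

for name, construct in [("GF1", gf1), ("GF2", gf2), ("NCF1", ncf1),
                        ("YCF2", ycf2), ("YCF3", ycf3)]:
    print(name, construct)
```

Each step changes exactly one element, which makes it easy to see that YCF3 differs from the starting construct in its promoter, NLS, activation domain and epitope tag, while the ZFP binding domain position is retained as the KpnI/BamHI insertion point.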

Sequences encoding modified plant ZFP binding domains can be inserted, as 
KpnI/BamHI fragments, into KpnI/BamHI-digested YCF3 to generate constructs encoding 
ZFP-functional domain fusion proteins for modulation of gene expression in plant cells. For 
example, a series of modified plant ZFP domains, described in Example 5 infra, were inserted 
into KpnI/BamHI-digested YCF3 to generate expression vectors encoding modified plant 
ZFP-activation domain fusion polypeptides that enhance expression of the Arabidopsis 
thaliana GMT gene. 

Example 7. Modified ZFP Designs for Regulation of an Arabidopsis thaliana 
gamma tocopherol methyltransferase (GMT) Gene 

Modified zinc finger proteins were designed to recognize various target sequences in 
the Arabidopsis GMT gene (GenBank Accession Number AAD38271). These proteins were 
modified in two ways. First, they contained a plant backbone as described in Example 4. 
Second, they contained a non-canonical (C2HC) third zinc finger in which the second
zinc-coordinating histidine of a canonical C2H2 structure was converted to a cysteine. Table 3
shows the nucleotide sequences of the various GMT target sites, and the amino acid sequences 
of zinc fingers that recognize the target sites. Sequences encoding these binding domains 
were prepared as described in Example 4 and inserted into YCF3 as described in Example 6. 
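The C2H2-to-C2HC conversion described above amounts to a single substitution at the second zinc-coordinating histidine. The sketch below illustrates this on a made-up finger sequence; the sequence is hypothetical and is not the plant backbone of Example 4, and the regular expression is the commonly cited C-X(2-4)-C-X(12)-H-X(3-5)-H spacing for C2H2 fingers, used here as an assumption.

```python
import re

# Commonly cited C2H2 spacing: C-X(2-4)-C-X(12)-H-X(3-5)-H.
C2H2_MOTIF = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def to_c2hc(finger: str) -> str:
    """Replace the second zinc-coordinating histidine of the first
    C2H2 motif found in finger with cysteine."""
    match = C2H2_MOTIF.search(finger)
    if match is None:
        raise ValueError("no C2H2 motif found")
    last_his = match.end() - 1  # index of the final His in the motif
    return finger[:last_his] + "C" + finger[last_his + 1:]


# Hypothetical finger sequence, for illustration only.
example_finger = "YKCPECGKSFSRSDHLTRHQRTHTGEKP"
print(to_c2hc(example_finger))
```

Only the final coordinating residue changes; the recognition helix (here the RSDHLTR stretch) is left untouched.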

Table 3

ZFP#  Target                      F1                        F2                        F3
1     GTGGACGAGT (SEQ ID NO:39)   RSDNLAR (SEQ ID NO:40)    DRSNLTR (SEQ ID NO:41)    RSDALTR (SEQ ID NO:42)
2     CGGGATGGGT (SEQ ID NO:43)   RSDHLAR (SEQ ID NO:44)    TSGNLVR (SEQ ID NO:45)    RSDHLRE (SEQ ID NO:46)
3     TGGTGGGTGT (SEQ ID NO:47)   RSDALTR (SEQ ID NO:48)    RSDHLTT (SEQ ID NO:49)    RSDHLTT (SEQ ID NO:50)
4     GAAGAGGATT (SEQ ID NO:51)   QSSNLAR (SEQ ID NO:52)    RSDNLAR (SEQ ID NO:53)    QSGNLTR (SEQ ID NO:54)
5     GAGGAAGGGG (SEQ ID NO:55)   RSDHLAR (SEQ ID NO:56)    QSGNLAR (SEQ ID NO:57)    RSDNLTR (SEQ ID NO:58)
6     TGGGTAGTC (SEQ ID NO:59)    ERGTLAR (SEQ ID NO:60)    QSGSLTR (SEQ ID NO:61)    RSDHLTT (SEQ ID NO:62)
7     GGGGAAAGGG (SEQ ID NO:63)   RSDHLTQ (SEQ ID NO:64)    QSGNLAR (SEQ ID NO:65)    RSDHLSR (SEQ ID NO:66)
8     GAAGAGGGTG (SEQ ID NO:67)   QSSHLAR (SEQ ID NO:68)    RSDNLAR (SEQ ID NO:69)    QSGNLAR (SEQ ID NO:70)
9     GAGGAGGATG (SEQ ID NO:71)   QSSNLQR (SEQ ID NO:72)    RSDNALR (SEQ ID NO:73)    RSDNLQR (SEQ ID NO:74)
10    GAGGAGGAGG (SEQ ID NO:75)   RSDNALR (SEQ ID NO:76)    RSDNLAR (SEQ ID NO:77)    RSDNLTR (SEQ ID NO:78)
11    GTGGCGGCTG (SEQ ID NO:79)   QSSDLRR (SEQ ID NO:80)    RSDELQR (SEQ ID NO:81)    RSDALTR (SEQ ID NO:82)
12    TGGGGAGAT (SEQ ID NO:83)    QSSNLAR (SEQ ID NO:84)    QSGHLQR (SEQ ID NO:85)    RSDHLTT (SEQ ID NO:86)
13    GAGGAAGCT (SEQ ID NO:87)    QSSDLRR (SEQ ID NO:88)    QSGNLAR (SEQ ID NO:89)    RSDNLTR (SEQ ID NO:90)
14    GCTTGTGGCT (SEQ ID NO:91)   DRSHLTR (SEQ ID NO:92)    TSGHLTT (SEQ ID NO:93)    QSSDLTR (SEQ ID NO:94)
15    GTAGTGGATG (SEQ ID NO:95)   QSSNLAR (SEQ ID NO:96)    RSDALSR (SEQ ID NO:97)    QSGSLTR (SEQ ID NO:98)
16    GTGTGGGATT (SEQ ID NO:99)   QSSNLAR (SEQ ID NO:100)   RSDHLTT (SEQ ID NO:101)   RSDALTR (SEQ ID NO:102)
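The rows of Table 3 can be read by splitting each target site into triplets and pairing them with the fingers. The sketch below assumes the usual arrangement for three-finger proteins, in which F3 contacts the 5'-most triplet and F1 the 3'-most triplet of the first nine nucleotides of the listed site; that orientation is an assumption of this example (though the table entries appear consistent with it), and the function name is illustrative.

```python
from typing import List, Tuple


def pair_fingers_with_triplets(target: str, f1: str, f2: str,
                               f3: str) -> List[Tuple[str, str]]:
    """Split the first nine nucleotides of target (written 5' to 3')
    into triplets and pair them with F3, F2 and F1 in that order."""
    core = target[:9]
    triplets = [core[i:i + 3] for i in range(0, 9, 3)]
    return list(zip(triplets, (f3, f2, f1)))


# ZFP 1 and ZFP 16 from Table 3.
print(pair_fingers_with_triplets("GTGGACGAGT", "RSDNLAR", "DRSNLTR", "RSDALTR"))
print(pair_fingers_with_triplets("GTGTGGGATT", "QSSNLAR", "RSDHLTT", "RSDALTR"))
```

Read this way, the same helix recurs opposite the same triplet in different proteins; for example, RSDALTR is paired with GTG in both ZFP 1 and ZFP 16.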

Example 8. Modulation of Expression of an Arabidopsis thaliana gamma 
tocopherol methyltransferase (GMT) Gene 

Arabidopsis thaliana protoplasts were prepared and transfected with plasmids 
encoding modified ZFP-activation domain fusion polypeptides. Preparation of protoplasts 
and polyethylene glycol-mediated transfection were performed as described (Abel et al.
(1994) Plant Journal 5:421-427). The different plasmids contained the modified plant ZFP
binding domains described in Table 3, inserted as KpnI/BamHI fragments into YCF3. 

At 18 hours after transfection, RNA was isolated from transfected protoplasts, using 
an RNA extraction kit from Qiagen (Valencia, CA) according to the manufacturer's 
instructions. The RNA was then treated with DNase (RNase-free), and analyzed for GMT
mRNA content by real-time PCR (TaqMan®). Table 4 shows the sequences of the primers
and probes used for TaqMan® analysis. Results for GMT mRNA levels were normalized to
levels of 18S rRNA. These normalized results are shown in Figure 4 as fold-activation of
GMT mRNA levels, compared to protoplasts transfected with carrier DNA (denoted "No 
ZFP" in Figure 4). The results indicate that expression of the GMT gene was enhanced in 
protoplasts that were transfected with plasmids encoding fusions between a transcriptional 
activation domain and a modified plant ZFP binding domain targeted to the GMT gene. 
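The normalization and fold-activation calculation described above can be written out explicitly. The sketch below assumes that relative quantities of GMT mRNA and 18S rRNA have already been derived from the real-time PCR data; the function name and the numbers in the usage line are hypothetical and are not results from Figure 4.

```python
def fold_activation(gmt_sample: float, rrna_sample: float,
                    gmt_control: float, rrna_control: float) -> float:
    """Fold-activation of GMT mRNA relative to the "No ZFP" control.

    Each GMT quantity is first normalized to the 18S rRNA quantity
    measured in the same RNA preparation.
    """
    normalized_sample = gmt_sample / rrna_sample
    normalized_control = gmt_control / rrna_control
    return normalized_sample / normalized_control


# Hypothetical relative quantities, for illustration only.
print(fold_activation(gmt_sample=6.0, rrna_sample=2.0,
                      gmt_control=1.5, rrna_control=1.0))
```

Normalizing to 18S rRNA before taking the ratio corrects for differences in the amount of RNA recovered from each transfection.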



Table 4

                      SEQUENCE
GMT forward primer    5'-AATGATCTCGCGGCTGCT-3' (SEQ ID NO:103)
GMT reverse primer    5'-GAATGGCTGATCCAACGCAT-3' (SEQ ID NO:104)
GMT probe             5'-TCACTCGCTCATAAGGCTTCCTTCCAAGT-3' (SEQ ID NO:105)
18S forward primer    5'-TGCAACAAACCCCGACTTATG-3' (SEQ ID NO:106)
18S reverse primer    5'-CCCGCGTCGACCTTTTATC-3' (SEQ ID NO:107)
18S probe             5'-AATAAATGCGTCCCTT-3' (SEQ ID NO:108)



Although the foregoing methods and compositions have been described in detail for 
purposes of clarity of understanding, certain modifications, as known to those of skill in the 
art, can be practiced within the scope of the appended claims. All publications and patent 
documents cited herein are hereby incorporated by reference in their entirety for all purposes 
to the same extent as if each were so individually denoted. 